Business Continuity Plan #
The Business Continuity plan aims to minimize interruptions to normal operations, limit the extent of disruptions and damage in disasters, and establish alternative means of operation in the event of emergencies.
The Business Continuity Plan describes the types of disruptions, the roles of key personnel in continuity planning and disruption response, the applications that could be disrupted, and the general strategies for ensuring business continuity.
Scope #
This policy affects all employees of Nested Knowledge and all contractors, consultants, temporary employees, and business partners. Business Continuity Planning applies to internal company practices as well as application delivery. Possible disruptions include product outages, internet outages, economic disruption, loss of key personnel, cyberattacks, and negative publicity.
Examples of Disruptions #
- External Product outage
- File Share goes down
- Unplanned internet outage
- Data loss
- Hardware/software failures
- Economic disruption
- Recession
- Turnover of critical employees
- Cyberattacks
- Negative Publicity (Reputation)
Application Profile #
Name | Manufacturer | Critical to Business? | Critical to application? | Comments |
---|---|---|---|---|
AWS | Amazon | Yes | Yes | Essential for running AutoLit/Synthesis |
NPM | Microsoft | Yes | Yes | Essential for building production deployments. In the event of repository outage, dependencies may be transferred from backups via FTP. |
PyPi | | Yes | Yes | Essential for building production deployments. In the event of repository outage, dependencies may be transferred from backups via FTP. |
Auth0 | | Yes | Yes | Essential for providing authorization & username/password management to all users. |
Stripe | | No | No | Stripe enables pay-on-the-site. Both paying and non-paying users may continue accessing the site in the event of an outage, and payments & subscriptions may be manually managed by the NK team in the event of a long-term outage. |
Google Suite | | Yes | No | In the event of an email disruption, we will shift to Outlook-based or other email platforms. In the event of a disruption to Google Meet, we will utilize Zoom for video calls. In the event of a document storage disruption, we will utilize Box for storing company documents. |
Toggl | | No | No | Used for employee and contractor time tracking. If a disruption occurs, we will require manual time tracking. |
Gusto | | Yes | No | Essential for payroll and benefits. |
QuickBooks | | Yes | No | Essential for storing financial information. |
Slack | | No | No | Utilized for business communication. If a significant disruption occurs, we will switch instant messaging to the chat application Signal. |
GitLab | | Yes | Yes | If a temporary disruption occurs, we will employ FTP & patch files. |
PubMed Entrez API | | No | No | When a disruption occurs, manual and recurring searches fail. Upon recovery, our system automatically begins rerunning scheduled failed searches. |
Unpaywall | | No | No | When a disruption occurs, the full text import feature is shown as “Not Available” on site. |
HubSpot | | No | No | |
Adobe Creative Cloud | | Yes | No | (Photoshop, Illustrator, InDesign, After Effects, Premiere Pro) |
Adobe Reader | | No | No | In the event of a disruption to Adobe Reader, we will switch to Docusign. |
OBS Studio | | No | No | |
Metabase | | No | No | Includes sensitive and confidential data. |
Scite | | Yes | Yes | When a disruption occurs, the scite badge no longer displays. |
ClinicalTrials.gov | | Yes | Yes | When a disruption occurs, manual and recurring searches fail, and NCTID bibliomining will fail. Upon recovery, our system automatically begins rerunning scheduled failed searches. |
EuropePMC | | Yes | Yes | When a disruption occurs, manual and recurring searches fail. Upon recovery, our system automatically begins rerunning scheduled failed searches. |
DOAJ | | Yes | Yes | When a disruption occurs, manual and recurring searches fail. Upon recovery, our system automatically begins rerunning scheduled failed searches. |
Abstra | Abstra | Yes | No | Disruptions may impact the timeliness of customer support actions. |
EQVista | | Yes | No | Disruptions may impact stockholder equity management. |
Roles and Contacts #
Name | Title | Role/Function | Contact Information |
---|---|---|---|
Kevin Kallmes | CEO | Executive decisions; personnel management | kevinkallmes@supedit.com 507-271-7051 |
Karl Holub | CTO | Technical Lead | karl.holub@nested-knowledge.com |
Kathryn Cowie | COO | Operational support | kathryn.cowie@nested-knowledge.com 301-272-0957 |
Business Continuity Strategies #
Loss of Function of Critical Applications #
- In the case of the loss of functionality to AutoLit or Synthesis for 30 or more minutes, the CTO will be notified, and Nested Knowledge will send a site disruption message to all users. The CTO and development team will assess the extent of any lost capabilities and timeline to restoration, and then communicate with company leadership regarding a recovery plan of specific functions.
- In the case of the loss of functionality to any other key/critical applications, the CTO will be notified. Site disruption messages will be sent to users only if the outage impacts end-user functions. In consultation with company leadership, the CTO and development team will create a plan to either restore function or shift to a different software provider.
- In case of outages, the CEO or another leader will email account representatives for customers with a proposed restoration timeline and details regarding the outage.
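The escalation rules in this list can be sketched as a small decision function. This is only an illustration of the policy as written, not a description of any tooling Nested Knowledge runs: the `Outage` type, the action names, and the assumption that callers pass only key/critical applications are all hypothetical. Only the 30-minute threshold and the AutoLit/Synthesis distinction come from the text above.

```python
from dataclasses import dataclass
from datetime import timedelta

# The 30-minute threshold comes from the policy; all names below are hypothetical.
OUTAGE_THRESHOLD = timedelta(minutes=30)
CORE_APPS = {"AutoLit", "Synthesis"}


@dataclass
class Outage:
    application: str          # assumed to be a key/critical application
    duration: timedelta       # how long functionality has been lost
    affects_end_users: bool = False


def escalation_actions(outage: Outage) -> list[str]:
    """Return the notifications the policy calls for, in order."""
    actions: list[str] = []
    if outage.application in CORE_APPS:
        if outage.duration < OUTAGE_THRESHOLD:
            return actions  # below the 30-minute threshold: no formal escalation
        actions += ["notify_cto", "send_site_disruption_message"]
    else:
        actions.append("notify_cto")
        if outage.affects_end_users:
            actions.append("send_site_disruption_message")
    # For any escalated outage, leadership emails customer account representatives.
    actions.append("email_customer_account_representatives")
    return actions


print(escalation_actions(Outage("AutoLit", timedelta(minutes=45))))
print(escalation_actions(Outage("GitLab", timedelta(minutes=5))))
```

Encoding the rules this way makes the two branches explicit: core products escalate on duration, while other critical applications escalate immediately to the CTO but reach users only when end-user functions are affected.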
Recession Planning #
- Our finances are based on private funding and revenue. Our costs are based on already-negotiated contracts with employees and contractors. In the event of a recession, the company would be open to federal support (such as the Paycheck Protection Program) or bank loans, but should not need to dramatically alter financing.
Loss of Key Personnel #
- In the event that Nested Knowledge loses our CTO, we will elevate our head engineer to assume the CTO's duties and hire an additional engineer as soon as feasible.
- In the event that Nested Knowledge loses our CEO, the president will serve as the CEO.
- In the event that Nested Knowledge loses our COO, we will hire an already trained Operations Manager and bookkeeper to aid with record keeping and financial operations.
Compliance Statement #
All personnel who access Nested Knowledge’s information systems will be provided with and required to review this document. Personnel with central roles in business continuity planning will undergo annual training to ensure competence with business continuity procedures.
Bankruptcy or Acquisition #
In the event of imminent bankruptcy or closing of services, Nested Knowledge will provide clients with 90 days to export data from the Nested Knowledge application. In the event of an acquisition, clients will continue to have access to their data and functionality.
Business Impact Analysis (BIA) #
On an annual basis, the company will perform a business impact analysis to evaluate business activities to determine how critical they are for business and product operations. A BIA quantifies the impacts of disruptions on service delivery, risks to service delivery, and recovery time objectives. To complete a BIA, Nested Knowledge will examine criticality, resources, and priorities:
i) Criticality – Examine the impact of a system disruption to critical business processes.
ii) Resource – Determine resources required to resume business processes as quickly as possible. Examples of resources that should be identified include facilities, personnel, equipment, software, data files, system components, and vital records.
iii) Priorities – Establish priority levels for recovery activities and resources.
Updating and Review #
The BIA should:
- Undergo scheduled review annually for applicability and appropriate criticality, resourcing, and prioritization.
- Be updated following major changes to the business or its products, including but not limited to the launch of a new software product, an integration with an existing software product, or the creation of any new services based on the product.
- Be updated following major changes to ownership or oversight, including but not limited to acquisition, joint venture, or transfer of 51% of the voting shares in the company.
Estimating Downtime #
Recovery Time Objective (RTO) and Recovery Point Objective (RPO)
RTO defines the maximum amount of time that a system resource can remain unavailable before there is an unacceptable impact on other system resources, supported mission/business processes, and the Maximum Tolerable Downtime (MTD). RPO defines the point in time, prior to a disruption, to which data must be recoverable; equivalently, it is the maximum window of data loss that can be tolerated. Determining the RTO for each information system resource is important for selecting the technologies best suited to meeting the MTD.
Resource | RTO | RPO | Comments |
---|---|---|---|
Application Code (site-wide functionality outage) | 30 minutes | N/A | Bugs are most likely to be caught in verification immediately after deployment (15 minutes). In this event, the release is rolled back (5 minutes) and additional time provided for any database schema rollbacks (10 minutes). |
Critical Databases | 15 minutes | 5 minutes | Transaction logs are streamed to a backup on AWS RDS. A new instance may be provisioned from an arbitrary timepoint (10 minutes) and the private DNS record updated (5 minutes). |
Critical Servers | 30 minutes | N/A | New compute images have a scripted provisioning (15 minutes) and run a deploy inside 10 minutes. |
AWS (permanent outage) | 40 hours | 12 hours | This entry highlights a worst-case scenario: a permanent AWS outage requiring transfer of our services to a different cloud services provider (planned: Google Cloud). Time is allocated for provisioning of compute, load balancing, & database resources, transfer of database backups, DNS record transfer (or temporary new record creation), and network configuration. Database backups are performed twice daily to an offsite location, giving an RPO of 12 hours. |
AWS (transient outages) | | | We defer to AWS’s SLAs for service outages that do not require action as serious as a full transfer away. Services relevant to NK are Compute (servers), Databases, and Networking and Content Delivery (VPC, firewall, DNS). |
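
As a rough check on the table above, the step durations quoted in the Comments column can be summed and compared against each RTO, and the 12-hour RPO can be derived from the twice-daily backup schedule. The sketch below is illustrative only: the dictionary layout, step labels, and function names are assumptions, while the minute and hour values are copied from the table.

```python
# Values copied from the RTO/RPO table; structure and names are hypothetical.
RTO_MINUTES = {
    "Application Code": 30,
    "Critical Databases": 15,
    "Critical Servers": 30,
}

STEP_MINUTES = {
    "Application Code": {"verification": 15, "release rollback": 5, "schema rollback": 10},
    "Critical Databases": {"provision instance": 10, "update private DNS": 5},
    "Critical Servers": {"scripted provisioning": 15, "deploy": 10},
}


def rto_slack(resource: str) -> int:
    """Minutes of headroom left in the RTO after the listed recovery steps."""
    return RTO_MINUTES[resource] - sum(STEP_MINUTES[resource].values())


def rpo_hours(backups_per_day: int) -> float:
    """Worst-case data-loss window, assuming evenly spaced backups."""
    return 24 / backups_per_day


for resource in RTO_MINUTES:
    print(f"{resource}: {rto_slack(resource)} min of slack")
print(f"Offsite backup RPO: {rpo_hours(2):.0f} hours")
```

A negative slack value for any resource would indicate that the documented recovery steps no longer fit inside the stated RTO, which is one thing the annual BIA review could look for.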
Maximum Tolerable Downtime (MTD): #
For any cause: 48 hours. This estimate represents the 40-hour RTO for a worst-case failure (a permanent outage and transfer off of our current cloud provider), plus an 8-hour Work Recovery Time (WRT) for verifying the new system.
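The MTD figure is simply the worst-case RTO plus the WRT. A minimal sketch of that arithmetic follows, with a helper for checking an ongoing incident against the limit; the function and constant names are hypothetical, and only the 40-hour and 8-hour values come from this plan.

```python
from datetime import timedelta

WORST_CASE_RTO = timedelta(hours=40)     # AWS (permanent outage) RTO
WORK_RECOVERY_TIME = timedelta(hours=8)  # time to verify the new system


def maximum_tolerable_downtime(rto: timedelta, wrt: timedelta) -> timedelta:
    """MTD is the recovery time plus the time needed to verify the recovery."""
    return rto + wrt


def within_mtd(elapsed: timedelta, mtd: timedelta) -> bool:
    """True while an incident's elapsed downtime is still inside the MTD."""
    return elapsed <= mtd


mtd = maximum_tolerable_downtime(WORST_CASE_RTO, WORK_RECOVERY_TIME)
print(mtd.total_seconds() / 3600)            # hours
print(within_mtd(timedelta(hours=36), mtd))  # still inside the limit
```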
Disaster Planning and Recovery #
This plan explains Nested Knowledge’s procedure for mitigating disruption of product and services delivery when disruption due to disaster occurs. In the event of an actual emergency situation, modifications to this document may be made to ensure physical safety of our people, our systems, and our data. This plan describes the level of business disruption which could arise from each type of disaster.
Potential Disaster | Likelihood | Consequence | Remedial Actions |
---|---|---|---|
Pandemic | Highly Possible | Minor | No onsite location at risk; we will continue to build products and provide services in pandemics. |
Act of Terrorism | Possible | Minor | No onsite location at risk; however, a terrorist attack may disrupt personnel hours and availability or impact data centers. This risk is managed by AWS. |
Hurricane | Unlikely | Minor | No onsite location at risk; however, a hurricane may disrupt personnel hours and availability or impact data centers. This risk is managed by AWS and GCP. |
Fire | Unlikely | Minor | No onsite location at risk; however, a fire may disrupt personnel hours and availability or impact data centers. This risk is managed by AWS. |
Tornado | Unlikely | Minor | No onsite location at risk; however, a tornado may disrupt personnel hours and availability or impact data centers. This risk is managed by AWS. |
Disruption of servers | Unlikely | Major | This risk is managed by AWS. We operate out of multiple availability zones to increase resiliency to a single data center outage. |
Emergency and Disaster Recovery Team #
The disaster recovery team consists of Kevin Kallmes, Karl Holub, and Kathryn Cowie. In the event of an emergency, the team’s responsibilities include:
- Respond immediately to a potential disaster and call emergency services.
- Assess the extent of the disaster and its impact on the business.
- Notify employees and allocate responsibilities and activities as required.
- Restore critical services within four business hours of the incident.
- Recover to business as usual within 8 to 24 hours after the incident.
Communication and Notifications #
Notification of Emergency #
The person discovering the incident should call or email a member of the Emergency and Disaster Recovery Team immediately.
Contact with Employees #
Managers will serve as the focal points for their departments, while designated employees will call other employees to discuss the crisis/disaster and the company’s immediate plans. Employees who cannot reach staff are advised to call the staff member’s emergency contact to relay information on the disaster.
Personnel/Family Notification #
If the incident has resulted in a situation which would cause concern to an employee’s immediate family such as hospitalization of injured persons, it will be necessary to notify their immediate family members quickly.
Media Contact #
If applicable, assigned personnel will coordinate with the media, working according to guidelines that have been previously approved and issued for dealing with post-disaster communications.
Insurance Requirements #
To mitigate financial risk, legal exposure, data privacy breaches, and risks to other key company functions, Nested Knowledge maintains General Business / Professional Liability Insurance, Cyber Incident Insurance, Network Security and Privacy Liability Insurance, and System Damage and Business Interruption Insurance. A certificate of insurance is available for clients upon request.
Finances and Legal Action #
Financial Assessment #
The emergency response team shall prepare an initial assessment of the impact of the incident on the financial affairs of the company. The assessment should include:
- Loss of financial documents
- Loss of revenue
- Theft of check books, credit cards, etc.
- Loss of cash
Financial Requirements #
The immediate financial needs of the company must be addressed. These can include:
- Cash flow position
- Temporary borrowing capability
- Upcoming payments for taxes, payroll taxes, Social Security, etc.
- Availability of company credit cards to pay for supplies and services required post-disaster.
Legal Actions #
The company lawyer and the Emergency and Disaster Recovery Team will jointly review the aftermath of the incident and decide whether there may be legal actions resulting from the event; in particular, the possibility of claims by or against the company for regulatory violations, etc.
Tabletop Exercises and Disaster Scenario Planning #
On an annual basis, the executive and engineering management teams will independently develop a set of 10 potential disruption and disaster scenarios affecting our product, resources, and external dependencies. Five scenarios will be randomly selected, with a designated moderator running each exercise. The moderator will accept planned actions & team-level assignments and return outcomes, optionally adding in modifying information or new developments.
Scenarios are carried forward year to year.
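The random selection step could be done with a few lines of Python. This is a sketch under stated assumptions, not a prescribed procedure: it assumes the two teams' scenario lists are merged into one pool (dropping exact duplicates) before drawing, and the function name, parameters, and placeholder scenario labels are hypothetical.

```python
import random


def select_scenarios(executive, engineering, k=5, seed=None):
    """Draw k distinct scenarios at random from the teams' combined pool."""
    # Merge both lists and drop exact duplicates while preserving order.
    pool = list(dict.fromkeys(executive + engineering))
    if len(pool) < k:
        raise ValueError(f"need at least {k} scenarios, got {len(pool)}")
    # A fixed seed makes the draw reproducible for record keeping.
    return random.Random(seed).sample(pool, k)


# Placeholder labels only; real scenarios would come from each team's list.
executive = [f"executive scenario {i}" for i in range(1, 6)]
engineering = [f"engineering scenario {i}" for i in range(1, 6)]
for scenario in select_scenarios(executive, engineering, seed=2024):
    print(scenario)
```

Recording the seed alongside the exercise notes would let the draw be reproduced later if the selection is ever questioned.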
Revision History #
Author | Date of Revision/Review | Comments |
---|---|---|
K. Cowie | 10/14/2024 | Revised, removed Carta, added hurricane risk. |
K. Kallmes | 11/19/2021 | 2021 version finalized and signed off |
K. Holub | 06/25/2022 | Added a new supplier |
P. Olaniran | 10/24/2022 | Reviewed w/ Kevin K., Karl H., Kathryn C. |
K. Kallmes | 1/26/2023 | Reviewed BIA |