← Back to Home

Here's Exactly What Happens to Your Documents

You handle confidential study documents — protocols with proprietary drug information, monitoring plans with site-level details, investigator's brochures with unpublished safety data. Some of these are covered by confidentiality agreements with your clients. We understand what's at stake.

That's why we're not going to give you vague reassurances. Instead, we're going to show you exactly what happens to your documents at every stage — from upload to output delivery to deletion. You evaluate data integrity controls professionally. We're going to give you the same level of detail you'd expect in an audit.

We believe transparency is better than assurance. Here is the complete data flow for every document you upload.

Your Document's Complete Lifecycle

Every document uploaded to GxP Prep AI follows this exact path. There are no background processes, no hidden data flows, and no exceptions.

1

Upload — Your Browser to Our Server

When you select a file and click upload, the document is transmitted from your browser to our application server via HTTPS (TLS 1.2+). This is the same encryption standard used by banks and healthcare platforms.

Your document is encrypted in transit via TLS. It arrives at our Vercel-hosted serverless function.
The document is not written to any database, file system, or persistent storage. It exists only in the server's working memory.
2

Parsing — Text Extraction in Memory

The serverless function extracts the text content from your document. This extraction happens entirely in server memory. The original file and extracted text are held in memory only — they are never written to disk or to any database.

Text is extracted from your document in the server's working memory (RAM). The original file format is discarded after text extraction.
No file is saved to any disk, database, object storage, or backup system. No copy of your document is created anywhere.
3

AI Processing — Text Sent to Anthropic's Claude API

The extracted text is sent to Anthropic's Claude API for analysis and audit package generation. This means your document's text content does leave our server and is transmitted to Anthropic's infrastructure for processing. We use this section to be fully transparent about this step.

Your document's text is sent to Anthropic's API via an encrypted HTTPS connection. Anthropic's API terms state that they do not use API inputs to train their models and do not retain API inputs beyond the duration needed to process the request.
We cannot claim that your document text never leaves our infrastructure — it does, during this step. We are transparent about this because you deserve to know exactly where your data goes.
4

Validation — Citation Checking Against the RKG

The AI-generated audit content is checked against our Regulatory Knowledge Graph. During this step, the AI output (not your original document) is compared against our database of verified regulatory text.

AI-generated citations are compared against verified regulatory source text. Each citation is tagged with a confidence level.
Your original document text is not stored during validation. Only the AI-generated output is processed in this step.
5

Output Generation — Your Audit Package Is Built

The validated content is formatted into a structured Word document with epistemic tags, regulatory citations, and a validation summary. This document is generated in server memory and streamed to your browser as a download.

A formatted .docx file is generated in memory and sent directly to your browser for download.
The generated document is not stored on our servers. Once the download is delivered, the output is discarded from memory.
6

Deletion — Everything Is Discarded

After your audit package is delivered, the serverless function completes and its memory is released. Your uploaded document, the extracted text, the AI-generated content, and the formatted output all cease to exist on our infrastructure.

All data associated with your request — uploaded file, extracted text, AI output, and generated document — is released from memory when the serverless function completes.
No document, text, or output is retained anywhere on our infrastructure after delivery. We cannot retrieve your document after the session because it no longer exists.

Exactly What We Store — And What We Don't

What We DO Store

  • Your email address (for authentication)
  • Your subscription tier and billing status
  • Usage count (number of packages generated)
  • Account creation date
  • Stripe customer ID (for billing management)

What We Do NOT Store

  • Your uploaded documents — never
  • Extracted text from your documents — never
  • AI-generated audit content — never
  • Your generated audit packages — never
  • Document metadata (file names, sizes) — never
  • Any record of what you uploaded — never

If you delete your account, the only data that existed — your email, subscription status, and usage count — is removed. There are no document traces to delete because none were ever created.

Third-Party Services We Use

We believe you have a right to know every external service that touches your data and what their commitments are. Here is the complete list:

ServiceWhat It DoesWhat Data It SeesTheir Data Commitment
Anthropic (Claude API)Generates audit content from your document textThe extracted text from your uploaded document, during processing onlyAPI inputs are not used for model training. Inputs are not retained beyond processing duration.
VercelHosts the web application and serverless functionsEncrypted web traffic. Document data exists in serverless function memory during processing only.SOC 2 Type 2 certified. Data encrypted in transit (TLS 1.2+). Serverless function memory is released after execution.
SupabaseStores user accounts, subscription data, and the Regulatory Knowledge GraphYour email, subscription tier, and usage count. Does NOT see your uploaded documents.SOC 2 Type 2 certified. Row Level Security enforced. Data encrypted at rest and in transit. Americas region hosting.
StripeProcesses subscription paymentsYour payment information (card number, billing address). We never see or store your full card number.PCI DSS Level 1 certified (highest level). Payment data fully managed by Stripe.

What We Guarantee — And What We're Honest About

We respect you too much to bury important nuances in fine print. Here is an honest accounting of our security posture:

What We Can Confirm

  • All data in transit is encrypted via TLS 1.2 or higher between your browser and our servers, and between our servers and third-party APIs.
  • Your documents are never stored on any database, file system, or persistent storage on our infrastructure. Processing is entirely in-memory.
  • Your documents cannot be retrieved after processing because they no longer exist. There is nothing to retrieve, export, or leak after the session completes.
  • Authentication uses industry-standard protocols via Supabase Auth with secure session management.
  • Payment processing is handled entirely by Stripe, a PCI DSS Level 1 certified processor. We never see or store your full card number.
  • Our hosting provider (Vercel) and database provider (Supabase) are both SOC 2 Type 2 certified.

What We Are Transparent About

  • Your document text is transmitted to Anthropic's Claude API for AI processing. This means the text content of your document passes through Anthropic's infrastructure during generation. Anthropic's API terms state that inputs are not used for training and are not retained. However, we want you to know this step exists.
  • GxP Prep AI as an entity does not hold its own SOC 2 certification. Our infrastructure providers (Vercel and Supabase) are SOC 2 certified. We rely on their certified infrastructure for our security posture.
  • We are a pre-launch startup. Our security practices are described accurately above, but we have not yet completed a third-party penetration test or independent security audit. As the platform matures, we will pursue these and update this page accordingly.

We would rather tell you exactly where we are today than make claims we can't back up. This page will be updated as our security posture evolves.

If Your Client Asks How Their Data Is Protected

We know that some of you will need to explain your use of this platform to sponsors or clients who are sensitive about document confidentiality. Here is a summary you can share or adapt:

GxP Prep AI processes uploaded study documents entirely in server memory. No documents are stored on any database or file system at any point. The document text is transmitted to Anthropic's API (encrypted via TLS) for AI processing; Anthropic's terms confirm they do not retain API inputs or use them for training. After the audit package is generated and delivered, all data associated with the session — the uploaded file, extracted text, and generated output — is released from memory. The platform's infrastructure providers (Vercel and Supabase) are SOC 2 Type 2 certified.

This summary is accurate as of March 2026. We will keep it updated and notify users of any changes to our data handling architecture.

What's Coming

We are actively working to strengthen our security posture. Here is what's planned:

TimelineInitiativeStatus
Before beta launchPrivacy policy published (GDPR/CCPA compliant)In progress
Before beta launchTerms of Service reviewed by attorney with AI liability specializationIn progress
Before beta launchAnthropic API data handling terms reviewed and documentedIn progress
Near-termErrors & Omissions (E&O) insurance obtainedPlanned
Near-termEvaluate Anthropic zero-data-retention API options if availablePlanned
Near-termClient-side document parsing to minimize text sent to APIUnder evaluation
Medium-termIndependent security audit / penetration testPlanned
Medium-termSOC 2 Type 2 readiness assessmentPlanned
Medium-termData Processing Agreement (DPA) for enterprise/EU customersPlanned

This roadmap will be updated as items are completed. We will notify all users when significant security milestones are reached.

Frequently Asked Questions

Can you access my uploaded documents after I download my audit package?

No. Your documents exist only in server memory during processing. Once your audit package is delivered and the serverless function completes, the memory is released. There is no database record, no file system copy, and no backup. We cannot retrieve your documents because they no longer exist anywhere.

Does Anthropic store or train on my document data?

Anthropic's API Terms of Service state that they do not use API inputs to train their models and do not retain inputs beyond the duration needed for processing. We encourage you to review Anthropic's API terms directly for the most current commitments.

Is my data encrypted?

Yes. All data in transit between your browser and our servers, and between our servers and third-party APIs, is encrypted via TLS 1.2 or higher. Your account data stored in Supabase is encrypted at rest. Payment data is managed entirely by Stripe, which is PCI DSS Level 1 certified.

What if GxP Prep AI is hacked?

Because we do not store your documents, a breach of our infrastructure would not expose your study documents. An attacker accessing our database would find only email addresses, subscription tiers, and usage counts. They would not find any study documents, audit packages, or document-related data because none is stored.

Can I use this platform if my client's confidentiality agreement restricts data sharing?

You should review your specific confidentiality agreement. The key fact for your assessment: your document's text content is transmitted to Anthropic's API for processing (encrypted in transit, not retained per Anthropic's terms). If your agreement restricts sharing document content with third-party processors, you should consult with your client before using the platform. We are transparent about this rather than claiming it does not apply.

Do you have SOC 2 certification?

GxP Prep AI as an entity does not currently hold SOC 2 certification. Our infrastructure providers — Vercel (hosting) and Supabase (database and authentication) — are both SOC 2 Type 2 certified. SOC 2 readiness is on our security roadmap. We will update this page when we achieve it.

Will you notify me if your data handling practices change?

Yes. Any material changes to how we handle your data will be communicated to all users via email before the changes take effect. This page will always reflect our current practices.