OpenAI's Deep Research System Card: Mitigating Risks for Web-Browsing AI

The Engineer

26 Feb 2025

OpenAI's new system card reveals how the company safeguards its Deep Research model during web browsing, addressing safety concerns and providing transparency for AI practitioners.

OpenAI has released a detailed system card outlining the safety measures and risk assessments conducted before launching its new agentic capability, Deep Research. Deep Research is designed to perform multi-step research tasks on the internet and is powered by an early version of the OpenAI o3 model optimized for web browsing. Here’s what changed technically and why it matters to practitioners:

Technical Changes and Why They Matter

Model Capabilities:
- Web Browsing: Deep Research can search, interpret, and analyze massive amounts of text, images, and PDFs from the internet.
- User Data Analysis: It can read user-provided files and execute Python code to analyze data.
- Dynamic Reasoning: The model adapts its approach based on the information it encounters, making it highly flexible for complex tasks.

Safety Enhancements:
- Prompt Injections: Strengthened defenses against malicious instructions that could manipulate the model's behavior.
- Disallowed Content: Improved filters to prevent the model from generating or retrieving harmful content.
- Privacy Protections: Enhanced safeguards around personal information published online.
- Code Execution: Added layers of security to prevent unauthorized code execution.
- Bias and Hallucinations: Mitigated through better training data and model fine-tuning.

Preparedness Framework

OpenAI uses a Preparedness Framework to evaluate and mitigate risks. Here’s the scorecard for Deep Research:

CBRN (Chemical, Biological, Radiological, Nuclear): Medium
Cybersecurity: Medium
Persuasion: Medium
Model Autonomy: Medium

Scorecard Ratings (in ascending order of risk): Low, Medium, High, Critical

Only models with a post-mitigation score of "medium" or below can be deployed, and only models with a post-mitigation score of "high" or below can be developed further.
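
The deployment rule above can be expressed as a simple threshold check. The following Python sketch is illustrative only: the `Risk` enum and the function names are hypothetical, and taking the highest category rating as the overall score is an assumption made for this example, not something the system card summary here spells out.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Scorecard ratings, ordered from lowest to highest risk."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def can_deploy(post_mitigation: Risk) -> bool:
    # Deployment requires a post-mitigation score of "medium" or below.
    return post_mitigation <= Risk.MEDIUM


def can_develop_further(post_mitigation: Risk) -> bool:
    # Further development requires a post-mitigation score of "high" or below.
    return post_mitigation <= Risk.HIGH


# Category ratings as reported in the scorecard above.
scorecard = {
    "CBRN": Risk.MEDIUM,
    "Cybersecurity": Risk.MEDIUM,
    "Persuasion": Risk.MEDIUM,
    "Model Autonomy": Risk.MEDIUM,
}

# Assumption for this sketch: the overall score is the highest category rating.
overall = max(scorecard.values())

if __name__ == "__main__":
    print(f"Overall: {overall.name}")
    print(f"Deployable: {can_deploy(overall)}")
    print(f"May be developed further: {can_develop_further(overall)}")
```

Under these assumptions, a single category rated "high" would block deployment while still permitting further development, and a "critical" rating would block both.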

Introduction to Deep Research

Deep Research is an agentic capability that excels in multi-step research tasks on the internet. Powered by an early version of OpenAI o3, it uses reasoning to search, interpret, and analyze vast amounts of data from various sources. Key features include:

Web Browsing: Efficiently searches and interprets web content.
Data Analysis: Can read user-provided files and execute Python code for complex analysis.

OpenAI believes Deep Research will be valuable in a wide range of applications, from academic research to business intelligence.

Safety Testing and Mitigations

Before making Deep Research available to Pro users, OpenAI conducted rigorous safety testing:

External Red Teaming: Involved external experts to identify potential vulnerabilities.
Frontier Risk Evaluations: Assessed risks according to the Preparedness Framework.
New Mitigations:
- Strengthened privacy protections for personal information online.
- Trained the model to resist malicious instructions encountered during web searches.

Testing and Method Improvements

During safety testing, OpenAI identified opportunities to enhance its methods:

Human Probing: Conducted additional human-led probing to understand specific risks better.
Automated Testing: Implemented more automated tests for select risk areas.

These efforts were intended to ensure that Deep Research was thoroughly evaluated and improved before its launch.

Conclusion

OpenAI's Deep Research system card provides a transparent look at the safety measures and risk assessments conducted for this powerful new capability. By addressing key areas of concern through rigorous testing and mitigation, OpenAI aims to ensure that Deep Research is both effective and safe for users.