SOC: Security Operations Center Explained
Reviewed by Marcus Chen, CISSP
A SOC is a team organized to detect and respond to security events around the clock. An in-house operation costs $1.5 to $3 million annually, according to Ponemon Institute data. Most organizations under 1,000 employees outsource SOC functions through MDR or MSSP services because the staffing requirements for 24/7 coverage (minimum 8 to 12 analysts across tiers) exceed their security budgets. Understanding SOC structure and workflow explains why alert quality matters more than alert quantity and why understaffed SOCs create false confidence.
A Security Operations Center is people doing detection and response work. It sounds simple because it is, but organizations often misunderstand what a SOC actually requires and why it's expensive. Most companies don't build their own SOC because the cost and complexity are substantial. But understanding what a SOC does and how it works helps you understand what security monitoring actually requires, whether you're building a SOC in-house or outsourcing that work to a managed service provider. The structure and workflow of a SOC explain why threat detection doesn't just happen automatically and why alert quality matters more than alert quantity.
A SOC is fundamentally a team organized to detect and respond to security events. The team has different roles because the work ranges from routine to complex. You need people who can respond to alerts quickly and follow procedures. You also need people who can investigate sophisticated situations that don't fit standard playbooks. The organization of a SOC reflects this reality: some roles are entry-level and procedural, others require deep expertise. Some roles need to be staffed around the clock for organizations that operate continuously. Others are business-hours roles. The structure exists because not all security work is the same, and pretending it is creates disasters.
How a SOC is Staffed and Organized
A SOC uses a tiered structure: Tier 1 analysts handle routine alert triage, Tier 2 analysts investigate complex incidents, and Tier 3 analysts handle advanced threats and threat hunting. Tier 1 analysts monitor alerts coming in from detection tools, whether that's a SIEM, an EDR platform, or another security system. When an alert appears, a Tier 1 analyst investigates: Is this a real security event, or is it a false positive? If it looks real, they gather information. What system was affected? What actions happened? Who else might be at risk? They document their findings and make a decision: close the alert as a false positive, handle a minor issue locally, or escalate to Tier 2 or incident response.
Tier 1 work is heavily procedural. There are playbooks for different alert types. "When you see alert type X, check for Y, then do Z." A trained analyst follows the playbook, gathers the right information, and makes the right decision. This is why Tier 1 is often entry-level in the security world. You can train someone to follow procedures and be effective. But Tier 1 work is also high-turnover because it's repetitive and can feel unrewarding after months of following the same playbooks.
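The "alert type X, check Y, do Z" pattern can be sketched as a lookup table plus a decision function. This is a minimal illustration, not a real SOC tool: the `Alert` class, the playbook entries, the check names, and the `triage` function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    alert_type: str
    host: str
    # Facts the analyst has confirmed so far, keyed by check name.
    confirmed: Dict[str, bool] = field(default_factory=dict)


# Hypothetical playbooks: for each alert type, the benign explanations
# an analyst must confirm before the alert can be closed.
PLAYBOOKS: Dict[str, List[str]] = {
    "privilege_escalation": ["user_is_admin", "approved_change_ticket"],
    "unusual_data_access": ["user_owns_files", "application_is_sanctioned"],
}


def triage(alert: Alert) -> str:
    """Return a Tier 1 decision for the alert."""
    checks = PLAYBOOKS.get(alert.alert_type)
    if checks is None:
        # No playbook covers this alert type, so Tier 1 hands it up.
        return "escalate"
    if all(alert.confirmed.get(check, False) for check in checks):
        return "close_false_positive"
    return "escalate"


admin_patch = Alert(
    "privilege_escalation",
    "srv01",
    {"user_is_admin": True, "approved_change_ticket": True},
)
print(triage(admin_patch))
```

The sketch collapses the decision to two outcomes for brevity. A fuller version would add the "handle locally" outcome for minor issues.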
Tier 2 analysts handle complex investigations. When an alert doesn't fit standard procedures or when Tier 1 escalates something complex, Tier 2 takes over. These analysts understand the threat landscape deeply. They can recognize attack patterns even when they don't match known signatures. They can investigate multi-system compromises and piece together what happened. Tier 2 requires experience and deep security knowledge. These analysts cost more and are harder to find.
Above Tier 2, there's usually a security architect or senior analyst who manages detection rules and tools. This person understands the SIEM or detection platform, sets up rules, tunes them based on what Tier 1 and Tier 2 are seeing, and works with the threat intelligence and security research community to stay current on emerging attacks. This is high-expertise work.
There's also usually an incident response coordinator or manager. When something serious is detected—an actual intrusion, not a false positive—incident response needs to take over. The coordinator makes sure the right people are notified, that the investigation stays organized, and that response actions are coordinated.
Shifts are staffed 24/7 in organizations that operate continuously or have global operations. A bank needs SOC coverage during every hour the markets are open. A healthcare organization needs SOC coverage around the clock because hospitals never close. A software company might staff its SOC during business hours for the primary office time zones and accept that some incidents during off-hours will be discovered by on-call staff. The shift structure varies by organization, but full-time operation requires significant staffing.
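A rough calculation shows why round-the-clock operation needs so many people. The sketch below assumes a 40-hour work week and an optional availability factor for leave and training; both are illustrative assumptions, and `analysts_for_continuous_coverage` is a hypothetical helper.

```python
import math

HOURS_PER_WEEK = 24 * 7  # 168 hours must be covered each week


def analysts_for_continuous_coverage(
    seats: int,
    hours_per_analyst_week: float = 40.0,
    availability: float = 1.0,
) -> int:
    """Analysts needed to keep `seats` positions staffed at all times.

    `availability` is the share of paid hours an analyst actually spends
    on shift (1.0 ignores leave, training, and sick days).
    """
    productive_hours = hours_per_analyst_week * availability
    return math.ceil(seats * HOURS_PER_WEEK / productive_hours)


print(analysts_for_continuous_coverage(seats=1))                    # 5
print(analysts_for_continuous_coverage(seats=2))                    # 9
print(analysts_for_continuous_coverage(seats=2, availability=0.8))  # 11
```

Under these assumptions, keeping two analysts on duty at every hour lands in the 8 to 12 range before any Tier 2 or Tier 3 specialists are counted.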
The Alert Investigation Workflow
Every alert follows a workflow from initial triage through investigation to resolution, and the quality of this workflow determines whether threats are caught in minutes or missed entirely.
When a detection rule triggers, an alert appears in the SOC's monitoring dashboard. A Tier 1 analyst sees the alert and starts the investigation workflow. The first question is always: Is this real? Most alerts are false positives—normal activity that triggered a rule. An analyst might see an alert for "privilege escalation attempt" that turns out to be a system administrator legitimately updating software. Or "unusual data access" that turns out to be a user accessing files they should access but from a new application.
The investigation process should follow procedures. There should be a checklist for the analyst: What system reported the alert? When did the activity happen? What user or process was involved? What action did they attempt? Has this alert appeared before from the same source? Are there other alerts from the same time window that might correlate? The analyst gathers this information and documents it. Good SOCs have investigation tools that make gathering information efficient rather than having analysts log into five different systems to hunt for evidence.
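The checklist questions above map naturally onto a structured record, which makes it easy to see what is still unanswered before an alert is closed or escalated. The field names and the `unanswered` helper below are hypothetical, chosen only to mirror the questions in the text.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class InvestigationRecord:
    reporting_system: Optional[str] = None         # What system reported the alert?
    activity_time: Optional[str] = None            # When did the activity happen?
    user_or_process: Optional[str] = None          # What user or process was involved?
    attempted_action: Optional[str] = None         # What action did they attempt?
    seen_before: Optional[bool] = None             # Has this appeared before from the same source?
    correlated_alerts: Optional[List[str]] = None  # Other alerts in the same time window?


def unanswered(record: InvestigationRecord) -> List[str]:
    """Names of checklist items the analyst has not yet filled in."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


record = InvestigationRecord(reporting_system="EDR", seen_before=False)
print(unanswered(record))
```

Only `None` counts as unanswered, so an explicit `False` or an empty list is treated as a documented finding.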
If the investigation suggests this is a false positive, the analyst closes the alert and potentially notes that the rule might need tuning. Over time, tuning based on what analysts see reduces false positive volume. If the investigation suggests there's something worth paying attention to but it's not critical, the analyst might handle it locally—contacting the user to confirm activity, making a configuration change, gathering evidence for the record.
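The tuning feedback loop can be sketched by counting how often each detection rule produces alerts that analysts close as false positives. The thresholds, rule names, and `tuning_candidates` function are illustrative assumptions, not values from any particular SIEM.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tuning_candidates(
    closed_alerts: Iterable[Tuple[str, bool]],
    min_alerts: int = 20,
    fp_threshold: float = 0.9,
) -> List[str]:
    """Rules whose closed alerts are mostly false positives.

    `closed_alerts` yields (rule_name, was_false_positive) pairs.
    Rules with fewer than `min_alerts` alerts are skipped because the
    sample is too small to judge.
    """
    totals: Counter = Counter()
    false_positives: Counter = Counter()
    for rule, was_false_positive in closed_alerts:
        totals[rule] += 1
        if was_false_positive:
            false_positives[rule] += 1
    return sorted(
        rule
        for rule, count in totals.items()
        if count >= min_alerts and false_positives[rule] / count >= fp_threshold
    )


# Hypothetical history: a noisy rule, a mixed rule, and a rarely firing rule.
history = (
    [("noisy_rule", True)] * 19 + [("noisy_rule", False)]
    + [("mixed_rule", True)] * 10 + [("mixed_rule", False)] * 10
    + [("rare_rule", True)] * 5
)
print(tuning_candidates(history))
```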
If the investigation suggests something serious, the analyst escalates. Escalation should be defined clearly. Is this a potential intrusion? Is it data exfiltration? Is it account compromise? Different severity levels get different responses. Tier 2 or incident response takes ownership from there. The handoff process should be structured. The escalating analyst documents what they found and what they tried. The receiving analyst or incident responder doesn't start from scratch.
The workflow seems straightforward but organizations often stumble. Without clear procedures, analysts make inconsistent decisions. Some real incidents get dismissed as false positives. Some false positives get escalated unnecessarily and waste incident response resources. Documentation matters enormously for consistency. Alert quality—meaning the percentage of escalated alerts that are actually real incidents—is a direct reflection of how well the investigation process is defined and followed.
Metrics That Actually Reflect SOC Effectiveness
Mean time to detect and mean time to respond are the two metrics that actually measure SOC effectiveness, not alert volume or dashboard counts that look impressive but reveal nothing about security outcomes.
How do you measure whether a SOC is working? The instinct is to count "number of threats detected," but this is hard to interpret. Did the SOC detect an actual intrusion, or is it counting alert noise as detections? Did the detection actually help, or would the organization have discovered the incident some other way?
Better metrics focus on how quickly and reliably the SOC works rather than on raw counts. Mean time to detect, or MTTD, measures how long it takes from when a security event happens to when the SOC detects it. For example, a system is compromised at 3 AM and the SOC's detection tool identifies the compromise at 3:15 AM, so the time to detect is 15 minutes. MTTD is the average of that figure across incidents. This metric is measurable and meaningful. Improving MTTD suggests better detection rules or better integration of data sources.
Mean time to respond, or MTTR, measures how long it takes from when the SOC detects something to when incident response starts addressing it. For example, the SOC detects a compromise at 3:15 AM, and the incident response team receives the escalation at 3:20 AM and starts investigating, so the time to respond is 5 minutes. MTTR is the average of that figure across incidents. This metric reflects alert quality and escalation procedure efficiency. Improving MTTR suggests clearer escalation procedures or better staffing.
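Both metrics reduce to averaging the gap between two timestamps. The sketch below reuses the 3:00, 3:15, and 3:20 AM example from the text; the calendar date and the `mean_minutes` helper are arbitrary assumptions made for illustration.

```python
from datetime import datetime
from statistics import mean
from typing import Iterable, Tuple


def mean_minutes(pairs: Iterable[Tuple[datetime, datetime]]) -> float:
    """Average gap in minutes between each (start, end) pair."""
    return mean((end - start).total_seconds() / 60 for start, end in pairs)


# One incident, using the times from the example (the date is arbitrary).
occurred = datetime(2024, 1, 1, 3, 0)
detected = datetime(2024, 1, 1, 3, 15)
response_started = datetime(2024, 1, 1, 3, 20)

mttd = mean_minutes([(occurred, detected)])          # event -> detection
mttr = mean_minutes([(detected, response_started)])  # detection -> response start

print(mttd, mttr)  # 15.0 5.0
```

In practice each list would hold one pair per incident over the reporting period, so the result is a true average rather than a single measurement.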
Alert accuracy rate measures what percentage of escalated alerts are real incidents versus false positives. If the SOC escalates 100 alerts per month and 85 of them turn out to be real security events, alert accuracy is 85%. This metric drives tuning—the goal is increasing accuracy by reducing false positives without missing real threats.
Coverage measures what percentage of the environment is monitored. If you have 2,000 endpoints and 1,800 are reporting to the SOC, coverage is 90%. If you have 50 servers and 48 have logging enabled, coverage is 96%. Coverage gaps are blind spots. Organizations should understand what systems they're monitoring and what blind spots exist.
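Alert accuracy and coverage are both simple ratios, so one helper covers the worked figures in the text. The function name `percentage` is hypothetical.

```python
def percentage(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


alert_accuracy = percentage(85, 100)        # real incidents / escalated alerts
endpoint_coverage = percentage(1800, 2000)  # reporting endpoints / all endpoints
server_coverage = percentage(48, 50)        # servers with logging / all servers

print(alert_accuracy, endpoint_coverage, server_coverage)  # 85.0 90.0 96.0
```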
These process metrics are measurable and actionable. They help drive improvement. But many organizations don't measure these things systematically. They can't describe their SOC's MTTD or MTTR. They don't know their false positive rate. They can't say what percentage of their infrastructure is covered by monitoring. This creates a measurement gap where it's hard to know whether the SOC is providing value or just consuming budget.
SOC Maturity: Progression and Investment
SOC maturity progresses from reactive alert handling to proactive threat hunting over years of investment, and each stage requires different staffing levels, tools, and processes.
The concept of SOC maturity describes how evolved a security operations center is. A level 1 SOC is reactive—analysts respond to alerts as they come in, following basic procedures. A level 2 SOC has more formalized procedures, playbooks, and training. A level 3 SOC has sophisticated detections and threat hunting—analysts actively searching for threats rather than just responding to alerts. A level 4 SOC adds automation and machine learning—systems automatically triaging alerts and recommending responses. A level 5 SOC is mature and continuously improving—regular assessment of detection effectiveness, feedback loops improving detection over time.
In practice, most organizations start at level 1 or 2. Moving to level 3 requires analysts with deeper expertise and more sophisticated detection infrastructure. Moving beyond that requires specialized talent and tools that are expensive and hard to maintain. Many organizations plateau at level 2 because the investment needed to reach level 3 and beyond is substantial and the skill set required is specialized.
Maturity level also correlates with cost. A level 1 or 2 SOC might cost $300K-$500K per year for an organization of moderate size. A level 3 SOC might cost double that or more. A level 4 or 5 SOC is aspirational for most organizations—the budget and expertise required are significant.
The right maturity level for an organization depends on risk and budget. A small organization with limited risk exposure probably doesn't need a level 3 SOC. They might acceptably operate at level 2—responding to alerts well, following procedures, documenting findings. A financial institution or healthcare organization handling sensitive data might need level 3 or higher—actively hunting for threats, not just reacting to alerts.
Understanding maturity also helps when you're considering outsourcing. If an MSSP offers a level 1 SOC—just alert response—that's different from offering a level 3 SOC with threat hunting. The cost and value are different. Knowing what maturity level you need helps you evaluate whether what you're buying matches what you need.
Internal SOC Versus Outsourced Services
An internal SOC gives you full control and deeper organizational knowledge, while outsourced services provide 24/7 coverage at lower cost. The right choice depends on your budget, team size, and how much control you need over the detection and response process.
Organizations can build and operate their own SOC or outsource monitoring to a managed security services provider, often called an MSSP, or use a managed detection and response service, called MDR. Each approach has tradeoffs.
An internal SOC gives you control. You define the procedures. Analysts understand your environment and business. You own the data and the detection logic. You're not dependent on a vendor's update cycle or priorities. But you pay for all of it: recruiting and retaining talented analysts in a competitive market, building infrastructure, training staff, maintaining tools. A Tier 1 analyst costs roughly $60K-$80K per year including benefits. You need multiple analysts for 24/7 coverage. Add infrastructure, tools, and management overhead, and an internal SOC easily costs $500K-$1M+ per year for a mid-sized organization.
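The salary arithmetic behind that estimate can be sketched as follows. It assumes every analyst is paid at the Tier 1 rate quoted above and uses the 8 to 12 analyst range cited for 24/7 coverage; the function name is hypothetical, and infrastructure, tools, and management overhead are left out.

```python
from typing import Tuple


def staffing_cost_range(
    min_analysts: int,
    max_analysts: int,
    min_cost_per_analyst: int,
    max_cost_per_analyst: int,
) -> Tuple[int, int]:
    """Lowest and highest annual analyst cost for the given ranges."""
    return (
        min_analysts * min_cost_per_analyst,
        max_analysts * max_cost_per_analyst,
    )


low, high = staffing_cost_range(8, 12, 60_000, 80_000)
print(low, high)  # 480000 960000
```

Because Tier 2 analysts cost more than Tier 1 analysts, this simplification understates the real bill.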
Outsourced services, whether MSSP or MDR, shift the operational burden to a vendor. You pay a subscription fee, typically per endpoint or per user. The service provider handles staffing, tool maintenance, and updates. You don't have to recruit security talent or manage training. You trade control for simplicity. The tradeoff is that service provider analysts might know less about your specific environment. The procedures and tools are standardized, not customized to your risk profile. You're dependent on the service provider's quality and responsiveness.
The right choice depends heavily on organization size and sophistication. A 50-person company almost always benefits from outsourcing. Building even a basic internal SOC is not cost-justified for that size. A 1,000-person company with sophisticated IT infrastructure and significant risk might benefit from an internal SOC: it can justify the cost, and it has complex needs that a standardized service might not address. A 200-person company is often best served by outsourced services as a stepping stone toward internal capabilities as it grows.
Cost is often the deciding factor. A rough rule: if you'd need only two to four full-time analysts for internal SOC coverage, an outsourced service often costs less and has lower operational burden. If you'd need five or more analysts, an internal SOC might cost less on a per-analyst basis. But cost isn't the only factor. Sophistication, control, and understanding of your environment matter too.
Connecting Alert Quality to Business Value
Alert quality directly determines whether the SOC catches real threats or drowns in false positives, and investing in detection rule tuning delivers more security value than adding more data sources.
The gap between SOC effectiveness and what executives expect is often substantial. An executive might think "we have a SOC, so we're protected from intrusions." The reality is more measured. A SOC reduces the time between when something goes wrong and when you're aware of it. That's valuable. But a SOC doesn't guarantee you won't have an intrusion. It increases the likelihood you'll catch it and respond before damage is catastrophic.
Alert quality determines whether the SOC is useful or just expensive. If alert quality is poor—lots of false positives, missed real events—the SOC creates noise without value. If alert quality is good—escalated alerts are real events, real events get caught—the SOC provides measurable value. The quality depends on how well the detection rules are tuned, which depends on how much expertise and effort goes into the SOC.
This is why a SOC is a people business first and a tool business second. The tools (SIEM, EDR, and so on) enable the work, but the quality comes from the people. Hiring good analysts, training them well, giving them good procedures to follow, and then measuring the output to drive continuous improvement: that's what separates a SOC that provides value from a SOC that just consumes budget.
Frequently Asked Questions
How many analysts does a SOC need?
For 24/7 coverage, you need a minimum of 8 to 12 analysts across three shifts, plus Tier 2 and Tier 3 specialists. The Ponemon Institute estimates that an effective in-house SOC requires at least $1.5 million in annual staffing costs before accounting for technology.
What is the difference between a SOC and an MDR service?
A SOC is the team and facility that performs detection and response. MDR is an outsourced version of that function. An internal SOC gives you deeper organizational knowledge and full control. MDR gives you 24/7 coverage at a fraction of the cost. Most organizations under 1,000 employees choose MDR.
What are the most important SOC metrics?
Mean time to detect (MTTD) and mean time to respond (MTTR) are the two metrics that directly reflect security outcomes. Alert volume and dashboard counts are vanity metrics that reveal nothing about whether threats are actually being caught and stopped.
Can we build a part-time SOC?
A business-hours-only SOC leaves you unmonitored during 128 of 168 hours per week. Most attacks do not time themselves to business hours. If 24/7 coverage is not feasible internally, outsource after-hours monitoring to an MDR or MSSP and retain business-hours coverage internally.
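The 128-hour figure follows from a standard five-day, eight-hour schedule, which is the assumption built into this short sketch. The function name is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def unmonitored_hours(days_staffed: int = 5, hours_per_day: int = 8) -> int:
    """Hours per week with no analyst on duty."""
    return HOURS_PER_WEEK - days_staffed * hours_per_day


gap = unmonitored_hours()
print(gap, round(100 * gap / HOURS_PER_WEEK))  # 128 76
```

Put differently, a business-hours SOC on that schedule is watching for roughly a quarter of the week.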
Fully Compliance provides educational content about IT compliance and cybersecurity. This article reflects current perspectives on SOC operations and security monitoring as of its publication date. SOC structures and best practices evolve — consult qualified security professionals for guidance specific to your organization's risk profile and operational needs.