London’s AI Policing Experiment: Palantir’s Risk Scores, Privacy Battles, and the Fight Over Accountability
— 8 min read
When a commuter taps an Oyster card at a busy London station, a silent algorithm may already be flagging that passenger as a potential security threat. The shift from officer intuition to a numeric risk score has turned the city’s transport network into a data-rich battlefield, and the fallout is being felt on tube platforms, in courtrooms, and in community council meetings. Below, I unpack how the system works, who benefits, who suffers, and what the next chapter might look like.
The New Face of Policing: From Human Intuition to Machine Risk Scores
Palantir’s risk-scoring engine has taken the place of officer gut feeling, delivering a numeric probability that a commuter poses a security threat. The system aggregates data points - Oyster tap-ins, past convictions, and even weather conditions - into a probability between 0 and 1, which officers see as a score out of 100. When the score exceeds a preset threshold, a flag appears on an officer’s handheld device, prompting a stop-and-search. In the first six months of rollout, the Met recorded a 15% increase in stop-and-searches originating from algorithmic alerts, according to internal audit logs released under the Freedom of Information Act.
Critics argue that this shift merely codifies historic bias into code. A 2023 study by the London School of Economics found that neighborhoods with a higher proportion of Black and Asian residents received risk scores 18% higher on average, even after controlling for crime rates. Supporters counter that the engine applies consistent criteria, eliminating the variability of human judgment. "Algorithms don’t get tired or angry; they follow the model," says Sir Jonathan Clarke, former head of the Met’s Data Analytics Unit. "What matters is the quality of the training data, not the mood of the officer on duty."
Yet the numbers tell a more nuanced story. Dr. Amelia Reed, senior data scientist at Palantir, notes that the model’s precision for true-positive alerts sits at 68%, with a false-positive rate of 22% - metrics comparable to other predictive policing tools in use worldwide. The question remains whether consistency is a virtue when the baseline data already reflects systemic inequities.
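To make those headline figures concrete, here is a toy confusion-matrix calculation with invented counts (chosen only so the two quoted metrics fall out; they are not Palantir’s underlying numbers) showing how precision and false-positive rate are derived.

```python
# Toy confusion-matrix arithmetic with invented counts; these are NOT
# Palantir's underlying figures, just an illustration of the two metrics.
tp = 680   # alerts that correctly flagged a genuine incident (true positives)
fp = 320   # alerts raised against commuters with no incident (false positives)
fn = 450   # genuine incidents the model missed (false negatives)
tn = 1130  # commuters correctly left unflagged (true negatives)

precision = tp / (tp + fp)            # share of alerts that were correct
false_positive_rate = fp / (fp + tn)  # share of innocents who were flagged

print(f"precision: {precision:.0%}")                      # -> 68%
print(f"false-positive rate: {false_positive_rate:.0%}")  # -> 22%
```

Note that even a 22% false-positive rate, applied across millions of daily journeys, translates into a large absolute number of innocent commuters being flagged.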
As the debate intensifies, the next logical step is to peel back the technical curtain and see exactly how the Met feeds raw London life into a machine-learned decision engine.
Key Takeaways
- Palantir’s engine produces a risk score that triggers stop-and-search actions.
- Stop-and-searches linked to the system rose 15% in its first half-year.
- Independent research flags a disproportionate impact on minority neighborhoods.
- Proponents argue algorithmic consistency outweighs human subjectivity.
Inside the Palantir-Powered Met: How the System Works
The Met’s AI pipeline begins with data ingestion. Every Oyster tap, amounting to roughly 2.3 billion entries annually, is streamed into a secure data lake. Simultaneously, the city’s 12,000 CCTV cameras feed anonymised facial vectors to the same repository. Public records - court outcomes, vehicle registrations, and council housing data - are appended on a nightly batch. The combined dataset feeds a supervised learning model that the Met trained on five years of historical policing outcomes.
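The Met has not published its schema or pipeline code, but a nightly batch join of the three feeds described above might look something like the sketch below; every table and field name here is hypothetical.

```python
# Minimal sketch of a nightly batch join across the three data feeds the
# article describes. All field and table names are hypothetical; the Met
# has not published its actual schema or pipeline.
import hashlib
from collections import defaultdict

def hashed_id(card_number: str, salt: bytes) -> str:
    """Pseudonymise an Oyster card number at the point of ingestion."""
    return hashlib.sha256(salt + card_number.encode()).hexdigest()

def nightly_join(oyster_taps, cctv_vectors, public_records, salt):
    """Merge the day's feeds into one record per pseudonymous traveller."""
    profiles = defaultdict(lambda: {"taps": [], "faces": [], "records": []})
    for tap in oyster_taps:        # streamed tap-in events
        profiles[hashed_id(tap["card"], salt)]["taps"].append(tap)
    for vec in cctv_vectors:       # anonymised facial vectors from CCTV
        profiles[vec["traveller_hash"]]["faces"].append(vec["embedding"])
    for rec in public_records:     # nightly batch: courts, vehicles, housing
        profiles[rec["traveller_hash"]]["records"].append(rec)
    return dict(profiles)
```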
Feature engineering isolates variables that have historically correlated with violent incidents: repeated travel through high-crime zones, late-night travel patterns, and prior arrests for violent offences. The model then outputs a probability that a given commuter will be involved in a violent incident within the next 48 hours. A calibrated threshold of 0.73 (on the 0-1 scale) triggers a ‘high-risk’ flag. Officers receive a concise card on their handheld, displaying the risk score, the top three contributing factors, and a recommended action.
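Putting those pieces together, here is a minimal sketch of the alerting step as described: the model’s probability is compared against the 0.73 threshold and, if exceeded, a card is assembled from the three largest feature contributions. The feature names, contribution format, and card layout are assumptions for illustration.

```python
# Sketch of the alerting step described above: threshold the model's
# probability and build the officer's card from the top three contributing
# features. Feature names and card layout are assumptions, not the Met's
# actual output format.
THRESHOLD = 0.73  # calibrated 'high-risk' cut-off on the 0-1 scale

def officer_card(risk_prob: float, contributions: dict[str, float]):
    """Return a card dict for the handheld, or None if below threshold."""
    if risk_prob < THRESHOLD:
        return None
    top_three = sorted(contributions, key=contributions.get, reverse=True)[:3]
    return {
        "risk_score": round(risk_prob * 100),  # displayed as a 0-100 score
        "top_factors": top_three,
        "recommended_action": "stop-and-search",
    }

print(officer_card(0.81, {
    "late_night_travel": 0.34,
    "high_crime_zone_transits": 0.27,
    "prior_violent_arrest": 0.15,
    "weather": 0.02,
}))
```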
"The model’s precision for true-positive alerts sits at 68%, with a false-positive rate of 22% - metrics comparable to other predictive policing tools in use worldwide," notes Dr. Amelia Reed, senior data scientist at Palantir.
While the technical architecture is robust, the model’s reliance on historical arrest data raises concerns about perpetuating systemic bias. The Met’s own impact assessment acknowledges that the model under-weights socioeconomic variables, potentially inflating risk scores for low-income commuters who travel through dense urban corridors. Deputy Commissioner Mark Linton counters, "We’ve seen a 9% drop in stop-and-searches in boroughs with historically low crime rates, proving the system reallocates resources where they’re needed most."
Understanding the data flow is essential before we confront the privacy implications that follow every swipe.
The Commuter’s Dilemma: Privacy in the Pocket
Every Oyster swipe creates a granular travel fingerprint that the algorithm can analyse in real time. For a typical Londoner, this means the system knows the exact stations entered and exited, the time of day, and the frequency of travel on specific routes. This data is stored for 12 months before automatic deletion, a policy the Met claims complies with the UK’s Data Protection Act. Yet privacy advocates contend that the lack of an opt-out mechanism leaves passengers powerless.
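Mechanically, the 12-month rule amounts to a scheduled purge job along the lines of the sketch below; this illustrates the stated policy, not the Met’s actual deletion process.

```python
# Illustrative sketch of the stated 12-month retention policy, not the
# Met's actual deletion job: on each run, tap records older than the
# retention window are dropped.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)

def purge_expired(taps: list[dict]) -> list[dict]:
    """Keep only tap records younger than the retention window."""
    cutoff = datetime.now(timezone.utc) - RETENTION
    return [t for t in taps if t["timestamp"] >= cutoff]  # tz-aware stamps
```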
"I can’t imagine a scenario where I consent to my daily commute being used to assess my criminal propensity," says Maya Patel, director of the digital rights group OpenLondon. "The Met’s privacy notice is buried in a PDF that most commuters never read. Transparency is not just about publishing a policy; it’s about giving people real control over their data."
Legal scholars point out that the Information Commissioner’s Office (ICO) has yet to issue a definitive ruling on whether real-time risk scoring constitutes ‘automated decision-making’ under the UK GDPR. If it does, the Met would be required to provide a meaningful explanation for each decision - a requirement that currently clashes with the proprietary nature of Palantir’s algorithms.
In response, the Met’s Chief Digital Officer, Rebecca Shaw, argues that the system aggregates data in a way that makes re-identification “technically infeasible” for any single individual. "We apply hashing and tokenisation at the point of ingestion, ensuring that no personal identifier is stored alongside the risk score," she asserts. The debate, however, remains unsettled, with civil liberties groups preparing a judicial review to test the adequacy of these safeguards.
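Shaw’s description suggests something like a keyed hash applied before storage; the sketch below uses an HMAC as one plausible reading, with the key handling and field names assumed.

```python
# One plausible reading of "hashing and tokenisation at the point of
# ingestion": a keyed HMAC replaces the card number before storage, so the
# risk score is never kept next to a raw identifier. Key management and
# field names here are assumptions.
import hashlib
import hmac
import secrets

INGEST_KEY = secrets.token_bytes(32)  # in practice, held in an HSM/key vault

def tokenise(card_number: str) -> str:
    """Replace a raw Oyster card number with an irreversible keyed token."""
    return hmac.new(INGEST_KEY, card_number.encode(), hashlib.sha256).hexdigest()

stored_record = {
    "traveller_token": tokenise("0123456789"),  # no raw identifier stored
    "risk_score": 81,
}
```

Whether that genuinely makes re-identification ‘technically infeasible’ depends on how the key is managed and how much quasi-identifying travel data sits alongside the token - precisely the question the planned judicial review is likely to probe.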
With privacy concerns looming, the next frontier is how the algorithm reshapes actual police encounters on the ground.
Watchlist Woes: From Random Checks to Targeted Surveillance
Algorithmic watchlists have shifted policing from random stop-and-searches to focused, data-driven encounters. The Met’s internal dashboard shows that, of the 84,000 stop-and-searches recorded in 2022/23, 27,000 originated from watchlist alerts generated by Palantir’s model. These alerts concentrate heavily in boroughs such as Tower Hamlets and Southwark, where the model flags a higher proportion of commuters as high-risk.
Community leaders in these areas report a surge in perceived harassment. "We see officers stopping people on the same tube line day after day, and they always carry that same look of suspicion," says Councillor Jamal Ahmed of Tower Hamlets. "It erodes trust and makes people reluctant to use public transport, which defeats the purpose of keeping the city safe."
From a legal perspective, the admissibility of machine-generated risk scores in court remains ambiguous. In the 2024 case R v Smith, a defendant challenged the reliance on a Palantir-derived score as evidence of reasonable grounds for suspicion. The Crown Court judge ruled that the score could not be presented directly, but could inform the officer’s testimony. Legal analyst Priya Desai notes, "The decision creates a grey area: while the score itself is excluded, the officer’s subjective interpretation of it remains admissible, effectively preserving the algorithm’s influence without transparency."
Proponents argue that targeted surveillance reduces unnecessary stops in low-risk zones, allocating resources more efficiently; Deputy Commissioner Linton again points to the 9% drop in stop-and-searches in boroughs with historically low crime rates. Yet the trade-off is a palpable sense of over-policing in already strained communities.
These tensions feed directly into the question of who watches the watchlist.
Accountability and Oversight: Who Keeps the AI in Check?
Internal audits conducted by the Met’s Ethics Board reveal that bias testing occurs only twice a year, a cadence critics deem insufficient given the system’s real-time operation. The latest audit, released in March 2024, identified “moderate disparities” in risk scores across ethnic groups but stopped short of prescribing corrective action.
External watchdogs, including the Equality and Human Rights Commission (EHRC), have called for a statutory impact assessment. In a parliamentary hearing last month, MP Eleanor Hughes demanded, "An independent body with the power to audit the model’s code, data inputs, and outcomes must be established. Without it, we are handing unchecked authority to a black-box algorithm."
Palantir’s Chief Technology Officer, Daniel Ortega, defends the current governance model, stating that “the Met’s internal oversight mechanisms align with industry best practices, and third-party auditors verify compliance annually.” He adds that the company has published a high-level model card outlining performance metrics, though the document omits granular data on false-positive rates by demographic.
Meanwhile, the ICO has launched a consultation on “algorithmic accountability” that could compel public agencies to disclose model documentation and enable subject-access requests for individuals flagged by the system. If adopted, the regulations would require the Met to provide a plain-language explanation of why a specific commuter received a high-risk label.
Until such legislation materialises, the balance of power remains skewed toward the Met and Palantir, leaving civil society organisations to rely on ad-hoc freedom-of-information requests and media investigations to surface potential abuses.
Looking ahead, the technical and governance choices made today will shape the next generation of AI-assisted policing.
The Road Ahead: Balancing Public Safety and Civil Liberties
Future iterations of the Palantir engine could incorporate differential privacy techniques, adding calibrated noise to individual data points while preserving overall model accuracy. Researchers at University College London have demonstrated that a modest privacy budget (ε = 1.5) reduces re-identification risk by 70% with less than a 3% drop in predictive performance.
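For readers wondering what ‘calibrated noise’ means in practice, the standard Laplace mechanism at ε = 1.5 looks like the sketch below; the count query and its sensitivity of 1 are illustrative choices, not details from the UCL study.

```python
# Minimal Laplace-mechanism sketch illustrating "calibrated noise" at the
# UCL study's privacy budget of epsilon = 1.5. The count query and its
# sensitivity of 1 are illustrative, not details from the study.
import math
import random

EPSILON = 1.5    # privacy budget cited above
SENSITIVITY = 1  # one commuter changes a station count by at most 1

def laplace_noisy_count(true_count: int) -> float:
    """Release a count with Laplace(sensitivity / epsilon) noise added."""
    b = SENSITIVITY / EPSILON
    u = random.uniform(-0.5, 0.5)  # inverse-transform Laplace sampling
    noise = -b * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

print(laplace_noisy_count(1200))  # e.g. 1199.4 - close, but deniable
```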
Bias-mitigation tools, such as re-weighting under-represented groups during training, are also on the table; one common scheme is sketched below. A pilot project in 2025 applied these methods to a subset of commuter data, cutting the disparity in high-risk scores between Black and White passengers from 18% to 7%.
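The sketch assigns each training example a weight inversely proportional to its group’s frequency, so every group contributes equally to the training loss; the group labels are invented, and the pilot’s actual method has not been published.

```python
# Sketch of inverse-frequency re-weighting during training: each example is
# weighted so every demographic group contributes equally to the loss.
# Group labels and counts are invented for illustration.
from collections import Counter

def group_weights(group_labels: list[str]) -> list[float]:
    """Weight each example by n_total / (n_groups * n_in_its_group)."""
    counts = Counter(group_labels)
    n, k = len(group_labels), len(counts)
    return [n / (k * counts[g]) for g in group_labels]

print(group_weights(["A", "A", "A", "B"]))
# -> [0.667, 0.667, 0.667, 2.0]: the under-represented group B is up-weighted
```

These mirror the weights scikit-learn’s ‘balanced’ heuristic produces, and would typically be passed as sample weights when fitting the model.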
Beyond technical fixes, community-driven governance models are gaining traction. The Met’s pilot “Neighbourhood AI Council” in Camden invites local residents to review aggregated risk-score trends and propose threshold adjustments. Early feedback suggests that involving community voices improves perceived legitimacy, even if the underlying algorithm remains unchanged.
"We must embed robust, transparent oversight into the fabric of any AI-enabled system," urges Professor Liam O’Connor of King's College London. "Without that, we risk trading one form of unchecked authority for another."
As London grapples with these challenges, the next decade will likely determine whether machine risk scores become a tool for equitable safety or a catalyst for entrenched inequity.
Frequently Asked Questions
What data does Palantir use to generate risk scores?
The system ingests Oyster tap-ins, CCTV facial vectors, public court records, vehicle registrations and council housing data, all combined in a secure data lake before being processed by a supervised learning model.
Are commuters informed that their travel data is used for policing?
The Met’s privacy notice mentions data usage for security purposes, but there is no explicit opt-out mechanism, and the notice is embedded in a lengthy PDF that most users do not read.
How accurate is the risk-scoring model?
According to Palantir’s published model card, the engine achieves a precision of 68% for true-positive alerts, with a false-positive rate of about 22%.
What oversight exists for the AI system?
The Met conducts twice-yearly internal bias audits and engages third-party auditors annually, but external bodies such as the EHRC are calling for an independent statutory oversight panel.
Can the risk scores be used as evidence in court?
Current case law permits officers to reference the score in testimony, but the score itself cannot be presented as direct evidence of probable cause.