AI benchmarks are broken. Here’s what we need instead.


Across the organizations where this approach has emerged and started to be applied, the first step is shifting the unit of analysis. 

For example, in one UK hospital system between 2021 and 2024, the question expanded from whether a medical AI application improves diagnostic accuracy to how the presence of AI within the hospital’s multidisciplinary teams affects not only accuracy but also coordination and deliberation. The hospital specifically compared coordination and deliberation in teams that used AI with teams that did not. Multiple stakeholders (within and outside the hospital) decided on metrics such as how AI influences collective reasoning, whether it surfaces overlooked considerations, whether it strengthens or weakens coordination, and whether it changes established risk and compliance practices.

This shift is fundamental. It matters most in high-stakes contexts, where system-level effects outweigh task-level accuracy. It also matters for the economy: it may help recalibrate inflated expectations of sweeping productivity gains, which so far rest largely on the promise of improving individual task performance.

Once that foundation is set, HAIC benchmarking can begin to take on the element of time. 

Today’s benchmarks resemble school exams—one-off, standardized tests of accuracy. But real professional competence is assessed differently. Junior doctors and lawyers are evaluated continuously inside real workflows, under supervision, with feedback loops and accountability structures. Performance is judged over time and in a specific context, because competence is relational. If AI systems are meant to operate alongside professionals, their impact should be judged longitudinally, reflecting how performance unfolds over repeated interactions. 

I saw this aspect of HAIC applied in one of my humanitarian-sector case studies. Over 18 months, an AI system was evaluated within real workflows, with particular attention to how detectable its errors were—that is, how easily human teams could identify and correct them. This long-term “record of error detectability” meant the organizations involved could design and test context-specific guardrails to promote trust in the system, despite the inevitability of occasional AI mistakes.

A longer time horizon also makes visible the system-level consequences that short-term benchmarks miss. An AI application may outperform a single doctor on a narrow diagnostic task yet fail to improve multidisciplinary decision-making. Worse, it may introduce systemic distortions: anchoring teams too early in plausible but incomplete answers, adding to people’s cognitive workloads, or generating downstream inefficiencies that offset any speed or efficiency gains at the point of the AI’s use. These knock-on effects, often invisible to current benchmarks, are central to understanding real impact.

The HAIC approach, admittedly, promises to make benchmarking more complex, more resource-intensive, and harder to standardize. But continuing to evaluate AI in sanitized conditions detached from the world of work will leave us misunderstanding what it truly can and cannot do for us. To deploy AI responsibly in real-world settings, we must measure what actually matters: not just what a model can do alone, but what it enables, or undermines, when humans and teams in the real world work with it.

Angela Aristidou is a professor at University College London and a faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute. She speaks, writes, and advises about the real-life deployment of artificial-intelligence tools for public good.




Recent Reviews


Google Maps has a long list of hidden (and sometimes just underrated) features that help you navigate seamlessly. Still, I was not a big fan of using Google Maps for walking until I started using the right set of features.

Add layers to your map

See more information on the screen

Layers are an incredibly useful yet underrated feature that can be utilized for all modes of transport. These help add more details to your map beyond the default view, so you can plan your journey better.

To use layers, open your Google Maps app (Android, iPhone). Tap the layer icon on the upper right side (under your profile picture and nearby attractions options). You can switch your map type from default to satellite or terrain, and overlay your map with details such as traffic, transit, biking, street view (perfect for walking), and 3D buildings (labeled “3D” on Android and “raised buildings” on iPhone). To turn off a map detail, go back to Layers and tap it again.

In particular, adding a street view and 3D/raised buildings layer can help you gauge the terrain and get more information about the landscape, so you can avoid tricky paths and discover shortcuts.

Set up Live View

Just hold up your phone

Google Maps’ Live View can make walking navigation easier. It uses augmented reality (AR) to show real-time directions: in addition to the route on your map, instructions are overlaid on the live view from your camera. This feature is especially useful when traveling or exploring new areas, since it gives you navigational cues for walking that go beyond a 2D map.

To use Live View, search for a location on Google Maps, then tap “Directions.” Once the route appears, tap “Walk,” then tap “Live View” in the navigation options. You will be prompted to point your camera at things like buildings, stores, and signs around you, so Google Maps can analyze your surroundings and give you accurate directions.

Download maps offline

Google Maps without an internet connection

Whether you’re on a hiking trip in a low-connectivity area or want offline maps for your favorite walking destinations, having specific map routes downloaded can be a great help. Google Maps lets you download maps to your device while you’re connected to Wi-Fi or mobile data, and use them when your device is offline.

For Android, open Google Maps and search for a specific place or location. In the placesheet, swipe right, then tap More > Download offline map > Download. For iPhone, search for a location on Google Maps, then, at the bottom of your screen, tap the name or address of the place. Tap More > Download offline map > Download.

After you download an area, use Google Maps as you normally would. If you go offline, your offline maps will guide you to your destination as long as the entire route is within the offline map.

Enable Detailed Voice Guidance

Get better instructions

Voice guidance is a basic yet powerful navigation tool that can come in handy during walks in unfamiliar locations and can be used to ensure your journey is on the right path. To ensure guidance audio is enabled, go to your Google Maps profile (upper right corner), then tap Settings > Navigation > Sound and Voice. Here, tap “Unmute” on “Guidance Audio.”

Apart from this, you can also use Google Assistant to help you along your journey, asking questions about your destination, nearby sights, detours, additional stops, etc. To use this feature on iPhone, map a walking route to a destination, then tap the mic icon in the upper-right corner. For Android, you can also say “Hey Google” after mapping your destination to activate the assistant.

Voice guidance is handy for both new and old places, like when you’re running errands and need to navigate hands-free.

Add multiple stops

Keep your trip going

If you walk regularly to run errands, Google Maps has a simple yet effective feature that can help you plan your route in a better way. With Maps’ multiple stop feature, you can add several stops between your current and final destination to minimize any wasted time and unnecessary detours.

To add multiple stops on Google Maps, search for a destination, then tap “Directions.” Select the walking option, then tap the three dots at the top (next to “Your Location”), and tap “Edit Stops.” You can now add a stop by searching for it and tapping “Add Stop,” and reorder the stops at your convenience. Repeat this process by tapping “Add Stops” until your route is complete, then tap “Start” to begin your journey.

You can add up to ten stops in a single route on both mobile and desktop, and use the journey for multiple modes (walking, driving, and cycling) except public transport and flights. I find this Google Maps feature to be an essential tool for travel to walkable cities, especially when I’m planning a route I am unfamiliar with.


More to discover

A new feature to keep an eye out for, especially if you use Google Maps for walking and cycling, is Google’s Gemini boost, which will allow you to navigate hands-free and get real-time information about your journey. This feature has been rolling out for both Android and iOS users.


