ArXiv introduces one-year ban for researchers who submit papers with unchecked AI-generated content



TL;DR

ArXiv will ban researchers for one year if they submit papers with obvious signs of unchecked AI generation, such as hallucinated references or leftover chatbot instructions. The policy, announced by computer science section chair Thomas Dietterich, is the first formal penalty by a major preprint platform for AI-generated slop.

 

ArXiv, the open-access repository that has served as the primary distribution channel for preprint research in computer science, mathematics, and physics for more than three decades, will ban authors for one year if they submit papers containing obvious signs of unchecked AI generation. Thomas Dietterich, chair of arXiv’s computer science section, announced the policy on Thursday, writing that submissions with “incontrovertible evidence” of unvetted large language model output mean “we can’t trust anything in the paper.”

The rule is not a blanket prohibition on using AI tools. Researchers can still use language models for drafting, editing, or analysis. What triggers the penalty is evidence that an author pasted LLM output into a paper without checking it, the kind of carelessness that produces hallucinated references, placeholder instructions from the chatbot, or fabricated data tables with notes reading “fill in with the real numbers from your experiments.” If moderators find such evidence and a section chair confirms it, the author faces a one-year ban from arXiv, after which all subsequent submissions must first be accepted by a peer-reviewed journal before they can appear on the platform.

Why it matters

ArXiv is not a journal. It does not peer-review papers. But it has become the de facto way that research circulates in several of the fastest-moving fields in science, particularly machine learning and artificial intelligence. Papers posted to arXiv are read, cited, and built upon long before they appear in formal publications, if they ever do. That makes the platform’s quality standards unusually consequential: a hallucinated citation on arXiv can propagate through the research literature just as effectively as one in a peer-reviewed journal, and often faster.

The scale of the problem is significant. A study published in The Lancet in May 2026 by researchers at Columbia University audited 2.5 million biomedical papers and 126 million references indexed on PubMed Central. It found that fabricated citations have risen twelvefold since 2023. In that year, roughly one in 2,828 papers contained at least one fake reference. By 2025, the rate had climbed to one in 458. In the first seven weeks of 2026, it was one in 277. The researchers attributed the surge to the proliferation of AI writing tools, noting that previous studies estimate 30 to 69 per cent of LLM-generated references in biomedical contexts are fabricated.
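To put the quoted rates on a common scale, the short Python sketch below converts each "one in N" figure into a percentage of papers and a fold change relative to 2023. It uses only the three rates quoted above; the names `ONE_IN_N` and `summarize` are illustrative and not taken from the study. On these three figures alone, the rise from 2023 to early 2026 works out to roughly tenfold, so the study's twelvefold figure presumably rests on a measure or baseline not quoted here.

```python
# Rates quoted from the Lancet study: one paper in N contained a fake reference.
ONE_IN_N = {"2023": 2828, "2025": 458, "2026 (first seven weeks)": 277}


def summarize(one_in_n):
    """Return (period, percent of papers affected, fold change vs. first period)."""
    baseline = next(iter(one_in_n.values()))  # first entry is the reference year
    rows = []
    for period, n in one_in_n.items():
        percent = 100.0 / n   # "one in n" expressed as a percentage
        fold = baseline / n   # how many times higher than the baseline rate
        rows.append((period, percent, fold))
    return rows


if __name__ == "__main__":
    for period, percent, fold in summarize(ONE_IN_N):
        print(f"{period}: {percent:.3f}% of papers, {fold:.1f}x the 2023 rate")
```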

ArXiv has reason to take the threat seriously. The platform receives thousands of submissions each month, and its volunteer moderation system was not designed to screen for machine-generated content at scale. Dietterich’s announcement described the new penalty as a “one-strike” rule, though decisions are subject to appeal and require confirmation by a section chair before being imposed.

What counts as evidence

The policy is deliberately narrow in what it targets. Dietterich listed specific examples of “incontrovertible evidence”: hallucinated references that do not correspond to any real publication, meta-comments from the language model left in the text (such as “here is a 200-word summary; would you like me to make any changes?”), and placeholder data with instructions to the author that were never removed. These are not subtle quality failures. They are signs that the author did not read the paper before submitting it.
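Two of those three signals are plain text that a simple scan can surface. The Python sketch below is a hypothetical illustration of that idea, not arXiv's actual moderation tooling: the pattern list and the `find_leftovers` function are assumptions loosely modelled on the examples Dietterich quoted, and a real check would still need a human to confirm each hit.

```python
import re

# Hypothetical patterns, loosely modelled on the examples quoted above.
# Illustrative only: these are not arXiv's actual moderation rules.
LEFTOVER_PATTERNS = {
    "chatbot meta-comment": re.compile(
        r"would you like me to|here is a \d+-word summary|as an ai language model",
        re.IGNORECASE,
    ),
    "unfilled placeholder": re.compile(
        r"fill in with the real|insert (?:your|the) (?:data|results|numbers) here",
        re.IGNORECASE,
    ),
}


def find_leftovers(text):
    """Return (line number, label, line text) for each line matching a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LEFTOVER_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "We evaluate on three benchmarks.\n"
        "Here is a 200-word summary; would you like me to make any changes?\n"
        "Table 2: fill in with the real numbers from your experiments.\n"
    )
    for lineno, label, line in find_leftovers(sample):
        print(f"line {lineno}: {label}: {line}")
```

Hallucinated references are deliberately left out of the sketch: confirming that a citation does not exist means looking it up in a bibliographic database, which a pattern match cannot do.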

The distinction matters because it avoids the far more difficult question of whether AI-assisted writing should be permitted at all. ArXiv’s existing policy already states that authors bear “full responsibility” for their content “irrespective of how the contents are generated.” The new penalty enforces that principle by targeting the most egregious violations, cases where the author’s failure to exercise any oversight is provable from the text itself.

That approach has practical advantages. Detecting whether a well-edited paper was drafted with the help of an LLM is unreliable with current detection tools, and attempting to enforce a broader ban would be both technically difficult and potentially punitive toward researchers who use AI tools responsibly. By focusing on obvious slop, arXiv can enforce the rule without needing to build or buy an AI-detection system, a technology that remains prone to its own errors.

A broader problem

ArXiv is not the only institution struggling with the issue. Academic conferences in computer science, including NeurIPS and ICML, have reported surges in submissions that appear to be generated with minimal human oversight. Nature published a feature in late 2025 describing how AI slop is creating a crisis in computer science, where the volume of low-quality submissions is overwhelming reviewers and diluting the signal-to-noise ratio of the field’s output.

Peer-reviewed journals face the same problem. The Lancet study found that fabricated citations appeared in papers that had already passed peer review, suggesting that reviewers are either not checking references or are unable to identify fabrications at the rate they are now appearing. Lead author Maxim Topaz, of Columbia University’s School of Nursing, warned that clinicians and guideline developers have no way of knowing when the evidence they rely on does not exist, a gap that efforts to reduce AI hallucinations in scientific research have not yet closed.

ArXiv itself is undergoing structural changes that may help it address the challenge. After more than 20 years as a project hosted by Cornell University, the platform is becoming an independent nonprofit, a move that should give it greater autonomy over its moderation policies and the ability to raise funds specifically to combat quality problems. It has also introduced a requirement for first-time submitters to obtain an endorsement from an established author, a gatekeeping measure aimed at reducing the volume of submissions from accounts created solely to publish AI-generated material.

The limits of enforcement

The new rule will catch the most careless offenders, researchers who submit papers they have not read. It will not catch researchers who use language models to generate plausible but incorrect claims, fabricate data, or produce papers that are fluent but scientifically vacuous. Those problems require peer review, institutional oversight, and a willingness within the research community to treat AI-assisted misconduct with the same seriousness as traditional forms of fabrication.

What arXiv’s policy does establish is a principle: if you submit a paper, you are responsible for every word in it. That has always been true in theory. The difference now is that language models have made it trivially easy to produce text that reads like science but contains nothing of substance. ArXiv’s one-year ban is a modest penalty for a serious offence, but it is also the first formal acknowledgement by a major research platform that the problem is no longer one of occasional carelessness. It is structural, it is growing, and it requires dedicated infrastructure to combat.





The first computer my family owned was an 80286 IBM clone, and it had lots of ports, none of which looked the same. There was a big 5-pin DIN for the keyboard, a serial port, a parallel port, a game port for our joystick, and of course, the VGA port for the monitor.

In comparison, a modern computer has much less diversity in the port department. Not only are there fewer types of ports, but the total number may be quite low as well. Modern laptops can be even more minimalist. Some have just a single port on the entire machine! Is this a bad thing? As with anything, the extremes are rarely ideal, but I’d say overall, this has been a pretty positive development for PCs.

The port explosion era was never sustainable

It was more like a port infection

You see, the reason we had so many ports for so long is that people kept inventing new interfaces to make up for the shortcomings of existing ones. However, instead of making the old ones obsolete, the newer, better interfaces simply piled up alongside them, as perfectly summarized in this classic XKCD comic.

A comic illustrates how competing standards multiply: first showing 14 competing standards, then people agreeing to create one universal standard, followed by a final panel showing there are now 15 competing standards. Credit: Randall Munroe (CC-BY-NC)

In laptops, the need for so many ports reached ridiculous heights. In this video posted by X user PC Philanthropy, you can see his Sager/Clevo D9T absolutely packed with all the trimmings, which makes for a rather massive laptop.

It is undeniably a cool machine, but it obviously goes against the principle of portable computing. Also, every port you install takes up power and space that could have gone to something else. That’s true for laptops and desktops.




















USB-C (almost) solved the problem

So close, but not quite there yet

Released to the public in the mid ’90s, USB came to the rescue. The “U” is for “Universal” and for the most part USB has lived up to that promise. Now there was one port that handled data and power. More importantly, USB is fully backwards compatible. So if you plug a USB 1.1 device into a modern USB port, it should work. Whether you can get software drivers for it is another story, but it will talk to the host device.

USB-C has proven to be less universal than I’d like, but the situation is still far better than it used to be. A single USB-C port on one of my laptops can act as a video output for just about anything, even an old VGA monitor.

A MacBook, CRT monitor, and iPad connected together. Credit: Sydney Louw Butler/How-To Geek

My smaller laptops don’t need special chargers anymore, and the latest laptops can pull 240W over USB-C, which is enough for all but the beefiest desktop replacement machines. There is no type of peripheral I can think of that doesn’t give you the option to use it over USB.

But the complaint isn’t so much that we only get USB these days; it’s that we get so little of it.

Minimal I/O enables better hardware design

Harder, better, faster, stronger

When you only put a handful of USB-C ports on a mobile computer, you reap numerous benefits. The low profile of USB-C means the laptop can be thinner, and the frame can be a stronger and more rigid unibody design. Internally, you have room for more battery, larger performance components, or better cooling.

A green Apple MacBook Neo on display on a wooden table with a product sign behind it. Credit: Patrick Campanale / How-To Geek

It also means the internals can be simpler, and cheaper to design and fabricate, though whether those savings are passed on to customers is another story altogether.

Wireless and cloud-first workflows reduce physical dependency

I guess they are “air” ports

Perhaps the first sign of major change was when smartphones dropped headphone jacks, but the fact is that wireless technologies are now good enough for most peripheral and data connections. So there’s no need to connect those devices directly to a port on a computer, which in turn means there’s no reason to have as many ports on the computer in the first place.

I can’t remember the last time I used a wired mouse or keyboard, and I only use Ethernet for devices that need extremely high speeds, low latency, or improved reliability. For normal day-to-day use, modern Wi-Fi is just fine. So while your laptop might not have as many wired ports on the outside, those wireless chips on the inside still give it numerous connectivity options for audio, input, and data transfer.

You could even make the same argument about storage to some extent, with many thin and light systems leaning on cloud storage to make up for a lack of ports to connect external storage.

MacBook Neo (Operating System: macOS; CPU: A18 Pro). The MacBook Neo with the A18 Pro chip is Apple’s most affordable laptop yet, with all-day battery life and buttery-smooth performance in a thin and light profile.



The dongle backlash misses the bigger picture

The last bit of the port protest centers around dongles, but I never understood the complaints. Having one port that can be broken out into whatever ports you need using a little box is amazing. It makes ports optional and gives you the choice. If you never plug your laptop into anything, why deal with all the ports you’ll never use?

Likewise, if you only ever use ports when your laptop is docked at a desk, you can just leave your dongle there, ready to go. And throwing a small dongle in your laptop sleeve or bag in case you might need it is a small price to pay for all the benefits of minimal I/O.


