Scientists Have Discovered A New Way To Count (And It’s Actually Really Important)

May 22, 2024 by Deborah Bloomfield

Science is great for innovation and improving our lives, but let’s face it: there are some things we’ve pretty much got down pat. You wouldn’t expect, for example, that we could improve on something like… like counting.

So it may come as a surprise that a group of computer scientists have done just that: found a new way to tackle a decades-old problem that asks what, on the face of it, looks to be a very simple question – how many distinct things are there in front of me?

It’s a harder problem – and a smarter solution – than you might think.

The Distinct Elements Problem

Computers can be very smart, but they can also be very, very… not-smart. Just look at the recent explosion of AI chatbots for evidence of that: they’re great at sounding intelligent, but put ‘em to the test and you might just find yourself in an ouroboros of bullshit.

And sometimes, it’s the things that seem almost laughably simple to a human that cause the most trouble. Take counting, for example – specifically, counting distinct objects. For us, it’s easy: we look at the collection of objects, and our brain just kind of automatically sorts them into groups for us. We barely have to work at it at all.

For computers, on the other hand, it’s a fundamental and decades-old problem. And it’s one that really needs to be answered, since its applications in the modern world span everything from network traffic analysis – think Facebook or Twitter monitoring how many people are logged in at any given time – to fraud detection, to bioinformatics, to text analysis, and much more.
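To make the problem concrete, here is a minimal sketch of the obvious exact approach: remember every distinct item you have seen. It is an illustration only, not something taken from the researchers' work, and the function name is made up for this example. It always gets the right answer, but the set it builds grows with the number of distinct items, which is exactly what becomes a problem when the data is huge.

```python
def count_distinct_exact(stream):
    """Exact count of distinct items (hypothetical helper for illustration).

    Memory use grows with the number of distinct items, because every
    one of them has to be remembered until the stream ends.
    """
    seen = set()
    for item in stream:
        seen.add(item)  # adding a duplicate leaves the set unchanged
    return len(seen)


words = "to be or not to be".split()
print(count_distinct_exact(words))  # 4 distinct words: to, be, or, not
```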

Now, obviously, we’ve been able to do those things for a while now, and that’s because this counting question – properly known as the Distinct Elements Problem – does have answers. They’re just not very good ones. 

“Earlier known algorithms all were ‘hashing based,’ and the quality of that algorithm depended on the quality of hash functions that algorithm chooses,” explained Vinodchandran Variyam, a professor in the University of Nebraska–Lincoln’s School of Computing, in a statement last year. 

But, together with colleagues Sourav Chakraborty of the Indian Statistical Institute and Kuldeep Meel of the University of Toronto, he discovered a way to massively simplify the problem: “The new algorithm only uses a sampling strategy, and quality analysis can be done using elementary techniques.”

How does it work?

The new method, since named the CVM algorithm in honor of its inventors, drastically reduces memory requirements – an important advantage in this modern age of big data – and it does so using a neat trick of probability theory. To illustrate the concept, consider the example used by Variyam and his colleagues, and in a recent article in Quanta Magazine: imagine you’re counting the number of unique words in Shakespeare’s Hamlet, but you have only enough memory to store 100 words at a time. 

First, you do the obvious: you record the first 100 unique words you come across. You’re now out of space – so you take a coin and flip it for each word. Heads, it stays; tails, you forget it.

At the end of this process, you’ll have around 50 unique words in your list. You carry on through the text from where you left off, adding new words as before – but this time, if you come to a word already on the list, you flip the coin again to see whether or not to delete it. Once you reach 100 words, you run through the list again, flipping a coin for each word and deleting or keeping it as prompted.

In round two, things are a tiny bit more complex: instead of one head to keep a word in the list, you’ll need two in a row – anything else, and it gets deleted. Similarly, in round three, you’ll need to get three heads in a row for it to stay; round four will need four in a row, and so on until you reach the end of Hamlet.

There’s method in the madness – and it’s a smart one, too. By working through the text like this, you’ve ensured that every word in your list had the same probability of being there: 1/2^k, where k is the number of times you had to work through the list. So, let’s say it took you six rounds to get to the end of Hamlet, and you’re left with a list of 61 distinct words: you can then multiply 61 by 2^6 (that is, 64) to get an estimate of the number of distinct words.
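The coin-flipping procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' own code: the function name, parameter names and the use of Python's `random` module are all choices made for this example. One simplification to flag: if the list happens to still be full after a round of coin flips, this sketch simply flips again, whereas a faithful implementation might handle that very rare case differently.

```python
import random


def cvm_estimate(stream, capacity, rng=None):
    """Estimate the number of distinct items in `stream` while storing
    fewer than `capacity` items between rounds.

    Hypothetical helper for illustration; names are assumptions.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    rng = rng or random.Random()
    keep_prob = 1.0  # chance a seen word is on the list: 1/2^k after k rounds
    kept = set()
    for item in stream:
        # Forget the item if it is already listed, then let it back in
        # with the current probability (k heads in a row in round k).
        kept.discard(item)
        if rng.random() < keep_prob:
            kept.add(item)
        # List full: flip a coin for every word and start a new round.
        # (Assumption: repeat if the list is somehow still full.)
        while len(kept) >= capacity:
            kept = {word for word in kept if rng.random() < 0.5}
            keep_prob /= 2
    # Each surviving word stands in for 1/keep_prob distinct words.
    return len(kept) / keep_prob


words = "to be or not to be that is the question".split()
print(cvm_estimate(words, capacity=100))  # list never fills, so this is exact: 8.0
```

With a list big enough to hold every distinct word, no coin is ever flipped against a word and the result is exact. With a smaller list the result is a random estimate, so two runs on the same text will usually give slightly different numbers unless a seeded `random.Random` is passed in.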

We’ll save you opening your calculator app: the answer is 3,904 – and according to Variyam and co, the actual answer is 3,967 (yes, they counted). If you have a memory that can store more than 100 words, the accuracy goes up further: with the ability to store 1,000 words, the algorithm estimates the answer as 3,964 – barely a rounding error already – and “of course,” Variyam told Quanta, “if the [memory] is so big that it fits all the words, then we can get 100 percent accuracy.”

A simple approach

So, it’s effective – but what makes the algorithm even more intriguing is its simplicity. “The new algorithm is astonishingly simple and easy to implement,” Andrew McGregor, a Professor in the College of Information and Computer Sciences at the University of Massachusetts, Amherst, told Quanta. “I wouldn’t be surprised if this became the default way the [distinct elements] problem is approached in practice.”

Indeed, since its posting in January 2023 – and barring a few minor quibbles and bugs in the meantime – the algorithm has attracted attention and admiration from many other computer scientists. That means that, while the paper detailing the algorithm has not been peer-reviewed in the official sense, it definitely has been reviewed by peers. In fact, Donald Knuth, author of The Art of Computer Programming and so-called “father of the analysis of algorithms,” wrote a paper in praise of the algorithm back in May 2023: “ever since I saw it […] I’ve been unable to resist trying to explain the ideas to just about everybody I meet,” he commented.

Meanwhile, various teams – Chakraborty, Variyam, and Meel included – have spent the last year investigating and fine-tuning the algorithm. Some, Variyam said, are already teaching it in their computer science courses.

“We believe that this will be a mainstream algorithm that is taught in the first computer science course on algorithms in general and probabilistic algorithm in particular,” he said. Knuth agrees: “It’s wonderfully suited to teaching students who are learning the basics of computer science,” he wrote in his May paper. “I’m pretty sure that something like this will eventually become a standard textbook topic.”

So, how did such a breakthrough algorithm evade notice for so long? According to Variyam, it’s not as unlikely as it sounds.

“It is surprising that this simple algorithm had not been discovered earlier,” he said. “It is not uncommon in science that simplicity is missed for several years.”

The paper is posted on arXiv and appeared in the Proceedings of the 30th Annual European Symposium on Algorithms (ESA 2022).
