⚠️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | FAIL | download_stamp > now() - 6 MONTH | 6.9 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://medium.com/@365daysofcomputer/47-error-detecting-and-correcting-codes-590431271137 |
| Last Crawled | 2025-09-12 04:26:38 (6 months ago) |
| First Indexed | not set |
| HTTP Status Code | 200 |
| Meta Title | 47: Error-Detecting and Correcting Codes | 365 Days of Computer | Medium |
| Meta Description | Before we can store files, let's make sure that the hardware they are saved on doesn't somehow break. |
| Meta Canonical | null |
| Markdown |
# 47: Error-Detecting and Correcting Codes
[365 Days of Computer](https://medium.com/@365daysofcomputer?source=post_page---byline--590431271137---------------------------------------)
7 min read
·
Nov 27, 2023
Before we can store files, let's make sure that the hardware they are saved on doesn't somehow break. While non-volatile memory is pretty reliable, it unfortunately does not work perfectly. People die, tectonic plates shift, and MOSFETs leak. It's sad to know that computer data decays significantly faster than data saved in books, which can last upward of 100 years: computer data tends to last only ten years without intervention.
The obvious solution to corrupted data is, clearly, to make sure corruption doesn't happen. (duh) One of the most common forms of data corruption is the bit-flip: a pesky, hard-to-pin-down issue whereby a single bit changes its value somewhere in the computer's memory or wiring, which can cause all sorts of problems, such as calling an operation that doesn't exist or making a number "miscalculate". In fact, such a small error can make or break an entire computer program.
Sadly, we cannot fully protect our data from such an event. The main cause of bit-flips is unexpected electron flow within a wire: rare, but frequent enough that preventative measures must be put in place. Going one level deeper down the chain of etiology, unexpected flow can be caused by overheating, which modern computers check for and notify you about (or even shut down to protect the hardware). The row hammer effect, a surprising security vulnerability, can also cause bit-flips, although it has since been largely mitigated in modern hardware. With smaller and smaller transistors, however, individual electrons have amplified effects, which is why the second law of thermodynamics comes into play and can mess things up for us.
While many of these issues can be fixed to a certain degree, that still leaves cosmic rays, the main culprit behind surprise computer crashes, against which preventative measures are very expensive (e.g., putting the computer in a Faraday cage). In other words, memory degrades all the time, especially with the increasing amount of data saved in hardware. We can "refresh" the memory, i.e., deposit more electrons into the cells already holding them, but this still doesn't correct all the errors that do occur. So when an error does happen, and it inevitably will, how might we fix it?
The glorious answer is what are known as *codes*. These come in two formats: *error-detecting* and *error-correcting* codes. The central idea is to save more data than we need in our hardware, data that gives us information about the information we have saved. That is, we include extra bits within every saved byte, word, kilobyte, etc., which are used to detect or correct bit-flips. *Coding theory*, the subfield of computer science that studies these codes, relies heavily on mathematics and has many applications, including mitigation of bit-flips, detection of data corruption, compression, and so on.
Given how many file formats there are and how many media data is transmitted through (air, space, water, wires, etc.), errors can occur in a large number of different ways. Errors in arithmetic or other calculations are most obviously a problem (e.g., your gipfeli now costs CHF 304). Errors in text might not hurt readability (rig~t?) even if they are irritating at best, while errors in sound (such as Voice over IP (VoIP)) would easily stay undetected, and errors in images might be noticed but would be inconsequential.
Given the number of such codes, many of which have special benefits over others, it is perhaps best to mention individual error-correcting codes as we go along and, for now, simply classify the types of error codes that exist.
The most basic form of an error-detecting code is probably data that includes *parity bits*. These are extra bits, saved in the hardware, that record whether there is an even or odd number of 1s in the data. For example, if we were to save an extra bit for each byte of a cache line (which would mean redesigning the underlying hardware to accommodate the parity bit), we could calculate its value by walking over the other 8 bits and toggling the parity bit for every "1" we encounter (0, 1, 0, 1, …).
Once the parity bit has been saved, we can transmit the data, keep it in a database for several years, etc. When we want to access the data again, we have a simple way to check whether an error may have occurred: we count the number of 1s, as before, and compare the result against the parity bit we saved. Once the sum is ready, there are two options:
1. The parity bit and the sum agree: there is an even number of errors (0, 2, 4, …)
2. The parity bit and the sum disagree: there is an odd number of errors (1, 3, 5, …)
If there are 0 errors, the sum and the parity bit will agree. If one error occurs, our sum inevitably differs from the parity bit, which is how we know there is an error, though we cannot correct it. If two bits are flipped, the count of 1s changes by an even amount, so the parity works out the same as before and the errors go undetected. In general, a single parity bit detects any odd number of errors but misses any even number. This is why in the early days of computing, when errors occurred often, the parity bit would not always catch them.
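The scheme above can be sketched in a few lines of Python (a minimal illustration, not how real hardware implements it; the function names are mine):

```python
def parity_bit(bits):
    """Even-parity bit: 0 if the number of 1s is even, 1 if odd.
    This is the 'toggle for every 1' counting described above."""
    p = 0
    for b in bits:
        if b:
            p ^= 1  # alternate 0, 1, 0, 1, ... for each 1 seen
    return p

def check(bits, stored_parity):
    """True if the data still matches its stored parity bit.
    Catches any odd number of bit-flips; misses any even number."""
    return parity_bit(bits) == stored_parity

data = [1, 0, 1, 1, 0, 0, 1, 0]  # one byte: four 1s, so even parity
p = parity_bit(data)             # p == 0

data[3] ^= 1                     # one bit-flip: detected
assert not check(data, p)

data[5] ^= 1                     # a second flip restores even parity: undetected
assert check(data, p)
```

The last two lines demonstrate exactly the blind spot discussed above: two flips cancel out in the parity count.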
Hardware has since improved, and we are now at a point where errors never *seem* to happen. This is clearly not true: bit-flips happen constantly. But it's the series of ever-improving error-detecting **and correcting** codes that makes it seem like everything is working flawlessly.
The most famous family of error-correcting and error-detecting codes is the *Hamming codes*, among the first error-correcting codes discovered, and to this day a primer on how it is done. 3Blue1Brown, a famous mathematics YouTube channel, has created [a fantastic intro](https://www.youtube.com/watch?v=X8jsijhllIA) to Hamming codes that explains the mechanism in greater detail than I could. But let's nonetheless briefly sketch the mechanism here to give you an idea of how detecting and correcting errors works.
Suppose we have a grid of 16 bits, and let's assume that only 12 of these bits correspond to our data. The other 4 bits act as redundant bits, meaning they don't hold any relevant data beyond error detection and correction; these are the bits colored red in the figure.
Hamming codes on 16 bits.
Now, each of these redundant bits records whether a specific region of the grid (itself included) contains an even (0) or odd (1) number of 1s. This is done for each of the four redundant bits over a different region, such as the first two rows, the last two columns, two non-adjacent rows, etc., as shown. We can do this with larger grids too, at the cost of more redundant bits; in other words, the method scales.
As with the parity bit, if a single error occurs, we can actually track it down using our redundant bits. Because 2⁴ = 16, there are 16 different configurations of 1s and 0s for our four redundant bits, meaning that, informationally speaking, we can store enough information about the information to locate an error. That is, the first two bits narrow down the row in which the error occurred: given four rows, two bits (00, 01, 10, 11) suffice to specify one. The other two bits narrow down the column in the same way, and from there we simply flip the offending bit back to the value we now know it must have had.
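This localization trick can be written compactly using the index-XOR formulation popularized by the 3Blue1Brown video (a hedged sketch of the mechanism, not code from the article; `syndrome` and `set_parity` are my own names):

```python
from functools import reduce

def syndrome(block):
    """XOR together the positions of all 1-bits in the 16-bit block.
    For a valid codeword this is 0; after a single bit-flip it equals
    the index of the flipped bit (row in the high two bits, column in
    the low two bits of the 4x4 grid)."""
    return reduce(lambda acc, i: acc ^ i,
                  (i for i, b in enumerate(block) if b), 0)

def set_parity(block):
    """Fill the redundant positions 1, 2, 4, 8 (the power-of-two
    indices) so that the syndrome of the whole block is 0."""
    block = list(block)
    for p in (1, 2, 4, 8):
        block[p] = 0
    s = syndrome(block)
    for p in (1, 2, 4, 8):
        if s & p:
            block[p] = 1
    return block

word = set_parity([0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1])
assert syndrome(word) == 0   # valid codeword

word[13] ^= 1                # simulate a bit-flip at row 3, column 1
pos = syndrome(word)         # pos == 13: the syndrome points at the error
word[pos] ^= 1               # flip it back
assert syndrome(word) == 0
```

One subtlety: a flip at position 0 also yields syndrome 0, which is why the full extended Hamming code spends position 0 on an overall parity bit.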
The code we have described here is more specifically known as a (16,12) Hamming code, where the two numbers refer to the total number of bits in the block and how many of them are data bits.
This would mean that we are saving a large number of error-detecting bits at the hardware level for each byte of data. But it works, and we don't have to worry about data corruption as much as before. These are only the two most basic examples of error-correcting codes, and many more, increasingly sophisticated ones exist. Without them, even small parts of our computer would not work properly, which is why their invention in 1947 (around the time that proper software was first being written, rather than digital logic directly implemented) makes historical sense and showcases how important this technology is for the functioning of even the simplest calculations.
[Error Correction](https://medium.com/tag/error-correction?source=post_page-----590431271137---------------------------------------)
[Coding Theory](https://medium.com/tag/coding-theory?source=post_page-----590431271137---------------------------------------)
[Written by 365 Days of Computer](https://medium.com/@365daysofcomputer?source=post_page---post_author_info--590431271137---------------------------------------)
A project explaining everything about computers. Batches published when they're ready.
|
| Readable Markdown | null |
| Shard | 77 (laksa) |
| Root Hash | 13179037029838926277 |
| Unparsed URL | com,medium!/@365daysofcomputer/47-error-detecting-and-correcting-codes-590431271137 s443 |