🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 77 (from laksa180)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄 INDEXABLE
✅ CRAWLED (2 months ago)
🤖 ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
| --- | --- | --- | --- |
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 2.6 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
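The pass/fail logic in the table above can be sketched as a small predicate. This is a hypothetical reimplementation, not the inspector's actual code: the field names (`download_http_code`, `download_stamp`, `history_drop_reason`, `fh_dont_index`, `ml_spam_score`, `meta_canonical`, `src_unparsed`) are taken from the conditions shown, while the dict-based record and the six-month window expressed via `timedelta` are assumptions.

```python
from datetime import datetime, timedelta

# Rough six-month window, matching "download_stamp > now() - 6 MONTH".
SIX_MONTHS = timedelta(days=182)

def is_indexable(page: dict, now: datetime) -> bool:
    """Re-check the five filters from the table on one page record."""
    canonical = page.get("meta_canonical")
    return (
        page["download_http_code"] == 200                  # HTTP status
        and page["download_stamp"] > now - SIX_MONTHS      # Age cutoff
        and page.get("history_drop_reason") is None        # History drop
        and page.get("fh_dont_index") != 1                 # Spam/ban
        and page.get("ml_spam_score", 0) == 0
        and canonical in (None, "", page["src_unparsed"])  # Canonical
    )

page = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 1, 18, 21, 13, 54),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175",
}
print(is_indexable(page, now=datetime(2026, 3, 20)))  # -> True
```

Any single failing condition (a 404, a non-zero spam score, a stale crawl stamp) flips the whole verdict, which mirrors how the table reports one PASS per filter.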

Page Details

| Property | Value |
| --- | --- |
| URL | https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175 |
| Last Crawled | 2026-01-18 21:13:54 (2 months ago) |
| First Indexed | 2024-02-12 10:35:22 (2 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Convolutional Neural Networks: A Comprehensive Guide \| by Jorgecardete \| The Deep Hub \| Medium |
| Meta Description | Convolutional Neural Networks: A Comprehensive Guide Exploring the power of CNNs in image analysis Table of contents What are Convolutional Neural Networks? Convolutional … |
| Meta Canonical | null |
Boilerpipe Text
Exploring the power of CNNs in image analysis. 14 min read. Feb 7, 2024. (Image created by the author with DALL-E 3)

Table of contents: What are Convolutional Neural Networks? Convolutional layers, Channels, Stride, Padding, Pooling Layers, Flattening layers, Activation functions in CNNs.

Convolutional Neural Networks, commonly referred to as CNNs, are a specialized type of neural network designed to process and classify images. If you are new to this field you might be wondering: how is it possible to classify an image? Well… images are also numbers! Digital images are essentially grids of tiny units called pixels. Each pixel represents the smallest unit of an image and holds information about the color and intensity at that particular point. (Pixel representation | Source)

Typically, each pixel is composed of three values corresponding to the red, green, and blue (RGB) color channels. These values determine the color and intensity of that pixel. You can use the following tool to understand better how the RGB vector is formed: (Geogebra RGB tool | Source)

In contrast, in a grayscale image, each pixel carries a single value that represents the intensity of light at that point, usually ranging from black (0) to white (255). (Grayscale image | Source)

How do CNNs work? To understand how a CNN functions, let's recap some of the basic concepts about neural networks. (If you are reading this post I am assuming that you are familiar with basic neural networks. If that's not the case I strongly recommend you read this article.)

1. Neurons: the most basic unit in a neural network. Each computes a weighted sum of its inputs, to which a non-linear function known as the activation function is applied. (Neuron representation | Source)

2. Input layer: each neuron in the input layer corresponds to one of the input features.
For instance, in an image classification task where the input is a 28 × 28-pixel image, the input layer would have 784 neurons (one for each pixel).

3. Hidden layers: the layers between the input and the output layer. Each neuron in a hidden layer computes a weighted sum of the outputs of the previous layer's neurons and applies a non-linear function to the result.

4. Output layer: the number of neurons in the output layer corresponds to the number of output classes (for a regression problem the output layer has only one neuron). For example, in a classification task with digits from 0 to 9, the output layer would have 10 neurons. (Neural Network process | Source: 3Blue1Brown)

Once a prediction is made, a loss is calculated and the network enters an iterative self-improvement process in which the weights are adjusted with backpropagation to reduce this error. Now we are ready to understand convolutional neural networks! The first question we should ask ourselves: what makes a CNN different from a basic neural network?

Convolutional layers. They are the fundamental building blocks of CNNs. These layers perform a critical mathematical operation known as convolution. This process applies specialized filters known as kernels, which traverse the input image to learn complex visual patterns.

Kernels are essentially small matrices of numbers. These filters move across the image performing element-wise multiplication with the part of the image they cover, extracting features such as edges, textures, and shapes. (Kernel operation | Source)

In the figure above, visualize the input as an image transformed into pixels. We multiply each term of the image by a 3 × 3 matrix (this shape can vary) and write the result into an output matrix. There are various methods to decide the digits inside the kernel; this will depend on the effect you want to achieve, such as detecting edges, blurring, or sharpening… But what are we doing exactly?
Let's take a deeper look at it.

Convolution operation. The convolution operation involves multiplying the kernel values by the original pixel values of the image and then summing up the results. This is a basic example with a 2 × 2 kernel:

We start in the top-left corner of the input: (0 × 0) + (1 × 1) + (3 × 2) + (4 × 3) = 19. Then we slide one pixel to the right and perform the same operation: (1 × 0) + (2 × 1) + (4 × 2) + (5 × 3) = 25. After we complete the first row, we move one pixel down and start again from the left: (3 × 0) + (4 × 1) + (6 × 2) + (7 × 3) = 37. Finally, we again slide one pixel to the right: (4 × 0) + (5 × 1) + (7 × 2) + (8 × 3) = 43. The output matrix of this process is known as the feature map.

Perfect, now we understand how this operation works! But why is it so useful? We are just multiplying and adding pixels; how can we extract image features by doing this? For now, I won't dive deeper into the convolution operation because I don't consider it pivotal for understanding conv nets in the beginning. However, if you are very curious, I will leave you what I believe to be the best public answer to that question.

That's it, you've understood the most fundamental concept behind CNNs: convolutional layers! At this point you may have a bunch of doubts (at least I had them). I mean, we understand how a convolution works, but: do kernels always traverse the image matrix one pixel at a time? What happens with the pixels in the corners? We only pass over them once; what if they hold an important feature? And what about RGB images? We stated that they are represented in 3 dimensions; how does the kernel traverse them? These are a lot of questions, but don't worry, all of them have an easy answer.
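The worked example above can be reproduced in a few lines of plain Python. This is a minimal stride-1, no-padding sketch; the 3 × 3 input and 2 × 2 kernel values are the ones from the example.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    sum the element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map

image  = [[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]]
kernel = [[0, 1],
          [2, 3]]
print(convolve2d(image, kernel))  # [[19, 25], [37, 43]]
```

The four entries of the result are exactly the four sums computed by hand above: 19, 25, 37, and 43.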
We'll start by understanding three essential components inside convolutional layers: channels, stride, and padding.

1. Channels. As I explained before, digital images are often composed of three channels (RGB), which are represented as three different matrices. (RGB decomposed image | Source) For an RGB image, there are typically separate kernels for each color channel, because different features might be more visible or relevant in one channel compared to the others. (Convolution operation in Red, Green, and Blue channels | Source)

Depth of the layer. The 'depth' of a layer refers to the number of kernels it contains. Each filter produces a separate feature map, and the collection of these feature maps forms the complete output of the layer. The output normally has multiple channels, where each channel is a feature map corresponding to a particular kernel. In the case of RGB, we typically use one channel for each of the 3 matrices, but we can add as many as we want. For example, say you have a grayscale image of a cat: you could create a channel specialized in detecting the ears and another in detecting the mouth. (CNN representation | Source)

This image illustrates the concept quite well: think of each layer in the convolution as a feature map with a different kernel (don't worry about the pooling part for now, we'll break it down in a minute).

☣️ Be careful not to confuse the channels of a convolutional layer with the color channels of the image. That was a representative example to understand the concept, but you can add as many channels as you want. Each channel will detect a different feature in the image based on the values you assign to its kernel.

2. Stride. We have discussed that in a convolution a kernel moves through the pixels of an image, but we haven't talked about the different ways in which it can do so. Stride refers to the number of pixels by which a kernel moves across the input image.
The example we saw before had a stride of 1, but this can change. Let's see a visual representation:

Stride = 1 (Hyperparameters of a Convolutional Layer | Source)
Stride = 2 (Hyperparameters of a Convolutional Layer | Source)

A stride of 2 not only changes the way the convolution iterates over the input but also makes the output smaller (2 × 2). Taking this into account we can conclude that a larger stride produces smaller output dimensions (as it covers the input image faster), whereas a smaller stride results in larger output dimensions.

But why would we want to change the stride? Increasing the stride allows the filter to cover a larger area of the input image per step, which can be useful for capturing more global features. In contrast, lowering the stride captures finer and more local details. In addition, increasing the stride can help control overfitting and reduce computational cost, as it shrinks the spatial dimensions of the feature map.

3. Padding. Padding refers to the addition of extra pixels around the edge of the input image. When you focus on the pixels at the image's edges, you'll notice that we traverse them fewer times compared to those positioned in the center. The purpose of padding is to adjust the spatial size of the output of a convolutional operation and to preserve spatial information at the borders.

Let's see another example with the CNN explainer:

Padding = 0 (focus on the edges and count how many times the kernel passes through them) (Hyperparameters of a Convolutional Layer | Source)
Padding = 1 (Hyperparameters of a Convolutional Layer | Source)

Now we pass more times through the pixels at the edges and get more information about them. In which cases do you want to apply padding?
Mainly when the edges of the image contain useful information that you want to capture. You can increase the padding up to the kernel size you are using.

And how does it affect the output? Padding increases the size of the output feature map. If you increase the padding while keeping the kernel size and stride constant, the convolution operation has more "room" to take place, resulting in a larger output. The output size of a convolutional layer can be calculated using the following formula:

Output size = (Input size − Kernel size + 2 × Padding) / Stride + 1

where "2 × Padding" accounts for padding applied to both the left and right sides (or top and bottom sides) of the input, and "+ 1" accounts for the initial position of the filter, which starts at the beginning of the padded input.

☣️ This is a visual explanation of padding, but at a practical level it doesn't have to be the same on all sides of the image. The padding dimensions can be asymmetric, or even follow a custom padding design.

If you have reached this point, you can now officially say that you know how convolutional layers work! Nevertheless, this is not the end of the journey… There is a common misconception among beginners that conv layers are convolutional neural networks. Convolutional layers are an essential component but, as the name indicates, they are a LAYER inside CNNs. We have covered the most important part of CNNs, but there are still two other special types of layers to understand: pooling layers and flattening layers.

Pooling layers. Before explaining how these layers work, it's crucial to be clear on this: although convolutional layers can decrease the output size, their principal objective is not DIMENSIONALITY REDUCTION. The main objective of convolutional layers is FEATURE EXTRACTION. In fact, in most cases we are not reducing the dimensions of our data, because we are creating new channels that weren't there before; so even if our feature maps are smaller, we have more of them.
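The output-size formula, (Input − Kernel + 2 × Padding) / Stride + 1, can be checked with a one-liner. Integer division models the floor taken when the stride doesn't divide evenly; the example sizes below are mine.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Spatial size of a conv layer's output along one dimension:
    (Input - Kernel + 2*Padding) // Stride + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(3, 2))                        # 2  (the 3x3 input, 2x2 kernel example)
print(conv_output_size(28, 3, padding=1, stride=1))  # 28 (padding preserves the size)
print(conv_output_size(28, 3, padding=0, stride=2))  # 13 (stride 2 roughly halves it)
```

Note how padding = 1 with a 3 × 3 kernel keeps a 28-pixel input at 28 pixels, which is exactly the "preserve spatial information at the borders" behavior described above.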
(Convolutional neural network representation | Source) Take a look at this example: here we might be reducing our feature map a bit in each convolutional layer, but we are creating many more channels. What about the subsampling layers? Those are pooling layers, and their main objective is indeed dimensionality reduction!

How pooling layers work. Imagine you have a large image and want to make it smaller while keeping all the important features, like edges and colors. The pooling layer operates independently on every depth slice of the input. It resizes the slice spatially, using the max or average of the values in a window slid over the input data. (Max and Avg Pooling Layers | Source) In this example, we have reduced the feature map from (4 × 4) to (2 × 2).

What is the difference between pooling and the convolution operation? In pooling, we are not applying any kernel to the input data; we are just summarizing the information with a simple operation (max or average).

What about the channels: does pooling also reduce the number of channels? You must understand this: pooling layers DO NOT REDUCE THE NUMBER OF CHANNELS. Each pooling operation IS APPLIED INDEPENDENTLY TO EACH CHANNEL of the input data.

Let's see another example; channels can be a bit complex to visualize at first, and I want to ensure that you understand correctly how they work. (Layers inside a CNN | Source) This is a good representation: here you can see how each pooling layer reduces the dimensions of the spatial space but does not reduce the number of channels. The number of channels is not reduced until the end of the architecture. With convolutional and pooling layers we CAN'T reduce the number of channels, just add more to the existing ones. So why and how do we combine all these channels?
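The 4 × 4 → 2 × 2 reduction described above can be sketched as a max pool with non-overlapping 2 × 2 windows (a common default). The input values here are made up for illustration, and the sketch works on a single channel, matching the point that pooling never mixes channels.

```python
def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling: the stride equals the window size,
    applied to one channel at a time (pooling never mixes channels)."""
    pooled = []
    for i in range(0, len(fmap), size):
        row = []
        for j in range(0, len(fmap[0]), size):
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

fmap = [[1, 3, 2, 4],
        [5, 6, 1, 2],
        [7, 2, 9, 0],
        [1, 8, 3, 4]]
print(max_pool2d(fmap))  # [[6, 4], [8, 9]]
```

Swapping `max(window)` for `sum(window) / len(window)` would give average pooling, the other variant mentioned above.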
After convolutional and pooling layers have extracted relevant features from the input image, we have to turn this high-dimensional feature map into a format suitable for feeding into fully connected layers. Here's where flattening layers come into action!

Flattening layers. Imagine you have a grid of data (like pixels in a feature map) and you want to line up all of these grid points in a single, long line. That's what flattening does: it takes the entire feature map and reorganizes it into a single, long vector. (Flattening concept | Source)

☣️ Although flattening changes the shape of the data, it does not change the actual information.

Why do we need flattening layers? Integration of features: by flattening the feature maps into a vector, the network can integrate the spatially distributed features it extracted for tasks such as classification. Compatibility with dense layers: fully connected (dense) layers are designed to operate on 1-dimensional data; hence, flattening is a necessary step to transition from the multidimensional tensors produced by convolutional layers to the format required for dense layers.

This leads us to our next question: why do we need dense layers in CNNs? While convolutional layers are good at detecting features in input data, dense layers are essential for integrating these features into predictions. For example, if we design a convolutional neural network for facial recognition, early layers might detect edges and textures, while dense layers might interpret these to recognize specific facial features. Without dense layers, CNNs would not be able to perform the high-level tasks that are often required, such as classifying images, detecting objects, or making predictions based on visual inputs.
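Flattening is just a reshape: a stack of 2-D feature maps becomes one long vector without changing any value. A minimal sketch (the channel count and sizes here are illustrative):

```python
def flatten(feature_maps):
    """Turn a list of 2-D feature maps (channels) into one 1-D vector."""
    return [value
            for channel in feature_maps
            for row in channel
            for value in row]

# Two 2x2 channels -> a vector of length 2 * 2 * 2 = 8.
channels = [[[1, 2], [3, 4]],
            [[5, 6], [7, 8]]]
print(flatten(channels))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Every original value survives in the output, which is the "changes the shape, not the information" point made above; the resulting 1-D vector is what the dense layers consume.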
CNN recap. Up to this point we have reviewed the whole CNN structure: convolutional layers, pooling layers, flattening layers, and dense layers, along with the fundamental concepts of channels, stride, and padding. We could say that we have joined all the pieces of the puzzle! Or maybe not… what about activation functions and backpropagation?

Backpropagation works similarly to how it does in feed-forward neural networks, but with some special adjustments. I won't focus much on its technical details; you can check out this very interesting article to learn more about it. If you know nothing about backpropagation, you can start by taking a look at my publication. However, I will certainly take a look at activation functions.

Activation functions in Convolutional Neural Networks. As you may know, activation functions are indispensable; otherwise, we would just be building a very large linear model. As in simple neural networks, we also need these non-linear terms in ConvNets. However, not all the layers we have seen have an activation function. Let's use an image as a reference to visualize this. Now you should understand the representation without any problem! Just one little thing: the first two pooling layers are not shown in this diagram. This is another way of visualizing CNNs; it doesn't mean they are not there, just imagine a filter between each layer that makes them smaller. (Complete CNN representation | Source)

In the feature-extraction part, the activations are in the convolutional layers. The process is quite straightforward: after each convolution operation you pass the result through an activation function. (Convolutional layer structure | Source) The pooling and flattening layers DON'T have an activation function. As we explained before, the main function of pooling layers is dimensionality reduction, and the main purpose of flattening layers is restructuring the data into a 1D vector.
We don't need non-linearities for that. Nevertheless, we do need activation functions for extracting complex features (we won't be able to capture the relevant characteristics of an image with only a linear function). In the classification part, all the fully connected layers and the output layer have an activation function, as in simple neural nets. Here we also need an activation function because we are using the extracted features to make a classification or a prediction, and the algorithm has to learn complex interactions (as a simple neural network would).

Thanks for reading! If you like the article make sure to clap (up to 50!) and follow me on Medium to stay updated with my new articles. Also, make sure to follow my new publication!
Markdown
[Sitemap](https://medium.com/sitemap/sitemap.xml) [Open in app](https://play.google.com/store/apps/details?id=com.medium.reader&referrer=utm_source%3DmobileNavBar&source=post_page---top_nav_layout_nav-----------------------------------------) Sign up [Sign in](https://medium.com/m/signin?operation=login&redirect=https%3A%2F%2Fmedium.com%2Fthedeephub%2Fconvolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175&source=post_page---top_nav_layout_nav-----------------------global_nav------------------) [Medium Logo](https://medium.com/?source=post_page---top_nav_layout_nav-----------------------------------------) [Write](https://medium.com/m/signin?operation=register&redirect=https%3A%2F%2Fmedium.com%2Fnew-story&source=---top_nav_layout_nav-----------------------new_post_topnav------------------) [Search](https://medium.com/search?source=post_page---top_nav_layout_nav-----------------------------------------) Sign up [Sign in](https://medium.com/m/signin?operation=login&redirect=https%3A%2F%2Fmedium.com%2Fthedeephub%2Fconvolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175&source=post_page---top_nav_layout_nav-----------------------global_nav------------------) ![](https://miro.medium.com/v2/resize:fill:32:32/1*dmbNkD5D-u45r44go_cf0g.png) [Mastodon](https://me.dm/@thedeephub) [The Deep Hub](https://medium.com/thedeephub?source=post_page---publication_nav-d0b4131403b5-5cc0b5eae175---------------------------------------) · Follow publication [![The Deep Hub](https://miro.medium.com/v2/resize:fill:38:38/1*pcIeNXCpxLmBPdFGsRhIcg.jpeg)](https://medium.com/thedeephub?source=post_page---post_publication_sidebar-d0b4131403b5-5cc0b5eae175---------------------------------------) Your data science hub. A Medium publication dedicated to exchanging ideas and empowering your knowledge. 
Follow publication 1 1 Top highlight 2 1 1 1 1 1 # Convolutional Neural Networks: A Comprehensive Guide ## Exploring the power of CNNs in image analysis [![Jorgecardete](https://miro.medium.com/v2/resize:fill:32:32/1*Kvrg_Y7d2jM7_SPFQuFXfA.jpeg)](https://medium.com/@jorgecardete?source=post_page---byline--5cc0b5eae175---------------------------------------) [Jorgecardete](https://medium.com/@jorgecardete?source=post_page---byline--5cc0b5eae175---------------------------------------) Follow 14 min read · Feb 7, 2024 3\.3K 45 [Listen](https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2Fplans%3Fdimension%3Dpost_audio_button%26postId%3D5cc0b5eae175&operation=register&redirect=https%3A%2F%2Fmedium.com%2Fthedeephub%2Fconvolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175&source=---header_actions--5cc0b5eae175---------------------post_audio_button------------------) Share Press enter or click to view image in full size ![](https://miro.medium.com/v2/resize:fit:700/0*dc2iT7aRXaNv3ZEq) Image created by the author with DALL-E 3 ### **Table of contents** 1. [What are Convolutional Neural Networks?](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#6caf) 2. [Convolutional layers](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#b6fa) 3. [Channels](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#316d) 4. [Stride](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#8a64) 5. [Padding](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#d74c) 6. [Pooling Layers](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#83ba) 7. [Flattening layers](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#7285) 8. 
[Activation functions in CNNs](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#fc3b) C**onvolutional Neural Networks**, commonly referred to as **CNNs** are a specialized type of neural network designed to process and classify images. If you are new to this field you might be thinking **how is it possible to classify an image?** Well… **images are also numbers\!** Digital images are essentially **grids of tiny units called pixels**. Each pixel represents the smallest unit of an image and holds information **about the color and intensity at that particular point**. ![](https://miro.medium.com/v2/resize:fit:355/0*Su2NsuI37Lii-OCK.png) Pixel representation \| [Source](https://medium.com/r?url=https%3A%2F%2Fwww.javatpoint.com%2Fconcept-of-pixel) Typically, each pixel is composed of three values corresponding to the **red, green, and blue (RGB)** color channels. These values determine the **color and intensity** of that pixel. > You can use the following [tool](https://www.geogebra.org/m/Dq2A7aRv) to understand better **how the RGB vector is formed:** ![](https://miro.medium.com/v2/resize:fit:383/1*NyhH__Pkvlxgt0iGPgg0UQ.png) Geogebra RGB tool \| [Source](https://medium.com/r?url=https%3A%2F%2Fwww.geogebra.org%2Fm%2FDq2A7aRv) In contrast, in a **grayscale image**, each pixel carries a single value that represents the intensity of light at that point. > Usually ranging **from black (0) to white (255)**. Press enter or click to view image in full size ![](https://miro.medium.com/v2/resize:fit:700/0*IjuQtgtesflU7sLy.png) Grayscale image \| [Source](https://medium.com/r?url=https%3A%2F%2Fwww.nzfaruqui.com%2Ftag%2Faccessing-the-pixel-value-grayscale-image%2F) ### How do CNNs work? To understand how a CNN functions let´s recap some of the basic concepts about Neural Networks. > (If you are reading this post I am assuming that you are familiar with **basic neural networks**. 
If that´s not the case I strongly recommend you to read this [article](https://towardsdatascience.com/understanding-neural-networks-19020b758230)). 1\.- **Neurons:** The most basic unit in a neural network. They are composed of a **sum of linear functions** and a **non-linear function** known as the **activation function** is applied to them. Press enter or click to view image in full size ![](https://miro.medium.com/v2/resize:fit:700/0*9ZTsY5m2Lksbl8O7.png) Neuron representation \| [Source](https://thedatafrog.com/en/articles/logistic-regression/) 2\.- **Input layer:** Each neuron in the input layer corresponds to one of the input features. > For instance, in an image classification task where the input is a **28 x 28-pixel image**, the input layer would have **784 neurons** (one for each pixel). 3\.- **Hidden Layer:** The layers between the input and the output layer. Each neuron in this layer is s**ummed** by the result of the neurons in the previous layers and multiplied by a **non-linear function**. 4\.- **Output Layer:** The number of neurons in the output layer corresponds to the number of output classes (In case we are facing a **regression** problem the output layer will only have **one neuron**). > For example, in a classification task with digits **from 0 to 9**, the output layer would have **10 neurons**. ![](https://miro.medium.com/v2/resize:fit:620/0*XoYvsfn9EBBQHHhM.gif) Nerual Network process \| Source: 3Blue1Brown Once a prediction is made, a **loss** is calculated and the network enters a **self-improvement iterative process** through which the weights are adjusted with [**backpropagation**](https://medium.com/towards-artificial-intelligence/backpropagation-2eeb25201095)to reduce this error. Now we are ready **to understand convolutional neural networks\!** The first question we should ask ourselves: - What makes a CNN different from a basic neural network? ### Convolutional layers They are the fundamental building blocks of CNNs. 
These layers perform a critical mathematical operation known as **convolution**. This process entails the application of **specialized filters known as kernels**, that traverse through the input image to learn complex visual patterns. **Kernels** They are essentially small matrices of numbers. These filters move across the image performing **element-wise multiplication** with the part of the image they cover, extracting features such as **edges, textures, and shapes**. ![](https://miro.medium.com/v2/resize:fit:390/0*Kvc8f4nHC0Vg_WaG.gif) Kernel operation \| [Source](https://medium.com/r?url=https%3A%2F%2Fcommons.wikimedia.org%2Fwiki%2FFile%3A2D_Convolution_Animation.gif) In the figure above, visualize the input as an image transformed into pixels. We multiply each term of the image by a 3 *×* 3 matrix (this shape can vary) **and pass it into an output matrix**. There are various methods to decide the digits inside the kernel. This will depend on the effect you want to achieve such as detecting edges, blurring, sharpening… > But what are we doing exactly? Let´s take a deeper look at it. **Convolution Operation**The convolution operation involves multiplying**the kernel value**s by the**original pixel values**of the image and then**summing up the results**. This is a basic example with a 2 *×* 2 kernel: ![](https://miro.medium.com/v2/resize:fit:363/0*4RTQZqKqrhsinDu-) We start in the left corner of the input: - *(0 × 0) + (1 × 1) + (3 × 2) + (4 × 3) =* ***19*** Then we slice one pixel to the right and perform the same operation: - *(1 × 0) + (2 × 1) + (4 × 2) + (5 × 3 ) =* ***25*** After we completed the first row we move one pixel down and start again from the left: - *(3 × 0) + (4 × 1) + (6 × 2) + (7 × 3) =* ***37*** Finally, we again slice one pixel to the right: - *(4 × 0) + (5 × 1) + (7 × 2) + (8 × 3) =* ***43*** The output matrix of this process is known as the**Feature map.** Perfect, now we understand how **this operation works\!** But… > Why is it so useful? 
We are just multiplying and adding pixels, how can we extract image features doing this? For now, I won´t be diving deeper into the convolution operation because I don´t consider it to be pivotal for understanding Conv. nets in the beginning. However, if you are very curious I will leave you what I believe to be **the best public answer** to that question: That´s it, you´ve understood the most fundamental concept behind CNNs,**Convolutional Layers**\! At this point, you may be having a bunch of doubts (at least I had them). I mean, we understand **how a convolution works**, but: - Kernels always traverse through the image matrix **one pixel at a time**? - What happens with the **pixels in the corners** , we are only passing over them one time, what if they have an important feature? - And what about **RGB images** ? We stated that they are represented in **3 dimensions** , how does the kernel traverse over them? These are a lot of questions but don´t worry, all of them have an easy answer. We’ll start by understanding **three essential components** inside convolutional layers: 1. [*Channels*](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#316d) 2. [*Stride*](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#8a64) 3. [*Padding*](https://medium.com/thedeephub/convolutional-neural-networks-a-comprehensive-guide-5cc0b5eae175#d74c) ### 1\.- Channels As I explained before, digital images are often composed of **three channels (RGB)** which are represented in three different matrices. ![](https://miro.medium.com/v2/resize:fit:560/1*whABJWYKeolCwllDcpg85w.png) RGB decomposed image \| [Source](https://medium.com/r?url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fmatlab-rgb-image-representation%2F) For an RGB image, there are typically **separate kernels for each color channel** because different features might be more visible or relevant in one channel compared to the others. 
![](https://miro.medium.com/v2/0*N7YTWQ-7P7s6Z9DZ.gif) Convolution operation in Red, Green, and Blue channels \| [Source](https://medium.com/r?url=https%3A%2F%2Ftowardsdatascience.com%2Fintuitively-understanding-convolutions-for-deep-learning-1f6f42faee1) - **Depth of the layer** The**‘depth’**of a layer refers to the number of kernels it contains. Each filter produces a separate**feature map**, and the collection of these feature maps forms the***complete output of the layer***. > The output normally has multiple channels, where each channel is a feature map corresponding to a particular kernel. In the case of RGB, we typically use **one channel** for each of the 3 matrices, but we can add as many as we want. > **For example**, let´s say that you have a gray-scale image of a cat, you could create a channel specialized in detecting the ears and another in the mouth. ![](https://miro.medium.com/v2/resize:fit:640/1*-FR6rFrKXktjxwDTlGofPQ.png) CNN representation \| [Source](https://medium.com/r?url=https%3A%2F%2Fwww.topcoder.com%2Fthrive%2Farticles%2Foverview-of-convolutional-neural-networks%3Futm_source%3Dthrive%26utm_campaign%3Dthrive-feed%26utm_medium%3Drss-feed) This image illustrates the concept quite well, think of each layer in the convolution as a feature map with a different kernel (don´t worry about the pooling part for now, we\`ll break it down in a minute). > **☣️ BE CAREFUL** with misunderstanding the channels in the convolution layer with the color channels in the image. That was a representative example to understand the concept but **you can add as many channels as you want**. > > Each channel will detect a **different feature** in the image based on the values you assign to its kernel. ### 2\.- Stride We have discussed that in a convolution a kernel moves through the pixels of an image, but we haven´t talked about the different ways in which it can do it. Stride refers to **the number of pixels by which a kernel moves across the input image**. 
The example we saw before had a stride of 1, but this can change. Let's see a visual representation:

- Stride = 1

![](https://miro.medium.com/v2/resize:fit:600/0*PwwAAFrpat2zNcsV.gif)

Hyperparameters of a Convolutional Layer | [Source](https://poloclub.github.io/cnn-explainer/)

- Stride = 2

![](https://miro.medium.com/v2/resize:fit:600/0*CvEOVhChtGfcgZFb.gif)

Hyperparameters of a Convolutional Layer | [Source](https://poloclub.github.io/cnn-explainer/)

A stride of 2 not only changes how the convolution iterates over the input, it also shrinks the output (to 2 × 2 in this example). Taking this into account, we can conclude that:

> A **larger stride** produces smaller output dimensions (as it covers the input image in fewer steps), whereas a **smaller stride** results in larger output dimensions.

> But why would we want to change the stride?

**Increasing** the stride lets each output value summarize a **larger area of the input image**, which can be useful for capturing **more global features**. In contrast, **lowering** the stride captures **finer, more local details**.

In addition, increasing the stride helps control **overfitting** and **reduces computational cost**, since it shrinks the spatial dimensions of the feature map.

### 3.- Padding

Padding refers to the **addition of extra pixels around the edges** of the input image.

If you focus on the pixels at the image's edges, you'll notice that **we traverse them fewer times** than those **positioned in the center**. The purpose of padding is to **adjust the spatial size** of the output of a convolution and to **preserve spatial information at the borders**.
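The effect of stride (and of padding) on the output size follows a standard formula: output = floor((input − kernel + 2 × padding) / stride) + 1. A small sketch, with illustrative names and the 5 × 5 input from the CNN-explainer examples:

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# A 5x5 input with a 3x3 kernel:
s1 = conv_output_size(5, 3, stride=1)               # 3  -> a 3x3 output
s2 = conv_output_size(5, 3, stride=2)               # 2  -> a smaller 2x2 output
same = conv_output_size(5, 3, stride=1, padding=1)  # 5  -> padding 1 preserves the size
```

Doubling the stride roughly halves each spatial dimension, which is why larger strides are sometimes used in place of pooling layers.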
Let's see another example with the [CNN explainer](https://poloclub.github.io/cnn-explainer/):

- Padding = 0 (focus on the edges and count how many times the kernel passes through them)

![](https://miro.medium.com/v2/resize:fit:600/0*zAqkOGriv74N7K_d.gif)

Hyperparameters of a Convolutional Layer | [Source](https://poloclub.github.io/cnn-explainer/)

- Padding = 1

![](https://miro.medium.com/v2/resize:fit:600/0*9EyOto1wcaYsbT-T.gif)

Hyperparameters of a Convolutional Layer | [Source](https://poloclub.github.io/cnn-explainer/)

Now we pass more times through the pixels at the edges and get more information about them.

> In which cases do you want to apply padding? Mainly when the edges of the image **contain useful information** that you want to capture.
>
> You can increase the padding up to the kernel size you are using.

> And how does it affect the output? Padding **increases the size of the output feature map**. If you increase the padding while keeping the kernel size and stride constant, the convolution has more "room" to take place, **resulting in a larger output**.

The output size of a convolutional layer can be calculated with the following formula:

![](https://miro.medium.com/v2/resize:fit:540/0*KkBxsHaa_8sZZtEs)

Output size = (Input size − Kernel size + 2 × Padding) / Stride + 1

Where:

- **"2 × Padding"** accounts for the padding applied to both the left and right (or top and bottom) of the input.
- **"+ 1"** accounts for the initial position of the kernel, which starts at the beginning of the padded input.

> ☣️ This is a visual explanation of padding, but in practice it doesn't have to be **the same on all sides of the image**.
>
> The padding can be **asymmetric** or even follow a **custom design**.

If you have reached this point, you can now officially say that you know how Convolutional Layers work! Nevertheless, **this is not the end of the journey…**

There is a common misconception among beginners that Conv.
layers *are* Convolutional Neural Networks. Convolutional layers are an essential component, but as the name indicates, they are a **LAYER** inside a CNN.

We have covered the most important part of CNNs, but there are still **two other special types of layers** to understand:

- Pooling Layers
- Flattening Layers

### Pooling Layers

Before explaining how these layers work, **it's crucial to have this clear**:

> Although Convolutional Layers can decrease the output size, their principal objective is not **DIMENSIONALITY REDUCTION**.
>
> The main objective of Convolutional Layers is **FEATURE EXTRACTION**.

In fact, in most cases we are **not reducing the dimensions** of our data, because we are creating **new channels** that weren't there before; even if each feature map is smaller, **we have more of them**.

![Beginners Guide to Convolutional Neural Network from… – Towards AI](https://miro.medium.com/v2/1*3BRLw4lsANPEfGgimG3YVQ.png)

Convolutional neural network representation | [Source](https://en.wikipedia.org/wiki/Convolutional_neural_network)

Take a look at this example: each Convolutional Layer may shrink the feature maps a bit, but it creates many more channels.

> What about the subsampling layers? Those are pooling layers, and their main objective is indeed **dimensionality reduction**!

### **How Pooling Layers Work**

Imagine you have a large image and want to make it smaller while keeping **all the important features**, like edges and colors. The pooling layer operates independently on every depth slice of the input and resizes it spatially, taking the **Max** or the **Average** of the values in a window slid over the input data.
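That window operation can be sketched in a few lines. This is an illustrative implementation (names and data are my own): a 2 × 2 window with stride 2, in both max and average flavors, applied to each channel independently.

```python
def pool2d(matrix, size=2, stride=2, mode="max"):
    """Max or average pooling over a 2D matrix with a square window."""
    h, w = len(matrix), len(matrix[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            window = [matrix[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out

fmap = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
maxed = pool2d(fmap, mode="max")   # 4x4 -> 2x2, keeps the strongest activation
avged = pool2d(fmap, mode="avg")   # 4x4 -> 2x2, keeps the window average

# Pooling a stack of channels: the operation runs on each channel separately,
# so the number of channels is unchanged.
channels = [fmap, fmap, fmap]
pooled = [pool2d(c) for c in channels]   # still 3 channels, each now 2x2
```

Notice there is no kernel and nothing to learn here: pooling only summarizes the values already in the window.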
![](https://miro.medium.com/v2/resize:fit:700/0*Lja1hLtBQS6H1-qI.gif)

**Max** and **Avg** Pooling Layers | [Source](https://medium.com/r?url=https%3A%2F%2Fpub.towardsai.net%2Fintroduction-to-pooling-layers-in-cnn-dafe61eabe34)

In this example, we have reduced the feature map from (4 × 4) to (2 × 2).

> What is the difference between pooling and the convolution operation? In **pooling**, we are not applying any kernel to the input data; we are just **simplifying the information** with a simple math operation (Max or Avg).

> What about the channels? Does pooling also reduce the number of channels? You must understand this:

> Pooling layers **DO NOT REDUCE THE NUMBER OF CHANNELS**.
>
> Each pooling operation **IS APPLIED INDEPENDENTLY TO EACH CHANNEL** of the input data.

Let's see another example; channels can be a bit complex to visualize at first, and I want to make sure you understand how they work.

![](https://miro.medium.com/v2/resize:fit:624/0*LBommOWPlEow8rbE.png)

Layers inside a CNN | [Source](https://jacobheyman702.medium.com/different-pooling-layers-for-cnn-4652a5103d62%C3%A7)

This is a good representation: you can see how each pooling layer **reduces the spatial dimensions** but does not **reduce the number of channels**. The number of channels is not reduced until the end of the architecture.

> With Convolutional and Pooling layers we typically **don't reduce the number of channels**; we add more to the existing ones.

> So why and how do we combine all these channels?

After the **convolutional** and **pooling** layers have **extracted the relevant features** from the input image, we have to turn this high-dimensional feature map into a format suitable for fully connected layers.
Here's where **flattening layers come into action!**

### Flattening layers

Imagine you have a grid of data (like the pixels in a feature map) and you want to line up all of these grid points in a single, long line. That's what flattening does: it takes the entire feature map and reorganizes it into a **single, long vector**.

![](https://miro.medium.com/v2/resize:fit:700/0*JDM5bgdLuwvDjhcZ.png)

Flattening concept | [Source](https://www.superdatascience.com/blogs/convolutional-neural-networks-cnn-step-3-flattening)

> ☣️ Although flattening changes the shape of the data, it does not change the actual information.

- Why do we need flattening layers?

**Integration of features**

By flattening the feature maps into a vector, the network can integrate the spatially distributed features it extracted for tasks such as classification.

**Compatibility with Dense Layers**

Fully connected (dense) layers are designed to operate on **1-dimensional data**; hence, flattening is a necessary step to transition from the multidimensional tensors produced by convolutional layers to the format required by dense layers.

This leads us to our next question:

> Why do we need Dense Layers in CNNs?

While convolutional layers are good at **detecting features** in the input data, dense layers are essential for **combining these features into predictions**. For example, if we design a convolutional neural network for **facial recognition**, early layers might detect **edges and textures**, while dense layers might interpret these to **recognize specific facial features**.

> Without dense layers, CNNs would not be able to perform the **high-level tasks** that are often required, such as **classifying images**, **detecting objects**, or **making predictions** based on visual inputs.
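The flattening step described above is just a reshape. A minimal sketch (illustrative data): a stack of feature maps of shape (channels × height × width) becomes one long 1D vector, with no values changed.

```python
def flatten(feature_maps):
    """Line up a (channels x height x width) stack into a single 1D vector."""
    return [v for channel in feature_maps for row in channel for v in row]

maps = [
    [[1, 2], [3, 4]],   # channel 0
    [[5, 6], [7, 8]],   # channel 1
]
vector = flatten(maps)   # length = 2 channels * 2 * 2 = 8
```

The resulting vector is what the first fully connected layer receives as its input, so its length fixes the size of that layer's weight matrix.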
### CNN recap

Up to this point, we have reviewed the whole CNN structure:

- Convolutional Layers
- Pooling Layers
- Flattening Layers
- Dense Layers

along with the fundamental concepts of **channels, stride**, and **pooling**. We could say that we have joined all the pieces of the puzzle! Or maybe not… **what about activation functions and backpropagation?**

**Backpropagation** works much as it does in feed-forward neural networks, with some special adjustments, so I won't focus on its technical details here.

> You can check out this very interesting article to learn more about it: [Convolutions and Backpropagations](https://pavisj.medium.com/convolutions-and-backpropagations-46026a8f5d2c)

> If you know nothing about backpropagation, you can start by taking a look at my publication: [Backpropagation: From Mystery to Mastery](https://medium.com/@jorgecardete/backpropagation-2eeb25201095)

However, I will certainly take a look at **activation functions**.

### Activation functions in Convolutional Neural Networks

As you may know, activation functions are indispensable; without them, we would just be building a very large linear model. As in simple neural networks, we also need these **non-linear terms** in ConvNets. However, **not all of the layers we have seen have an activation function.**

Let's use an image as a reference to visualize this. By now you should understand the representation without any problem! Just one little thing…

> The first two **pooling layers** are not shown in this diagram. This is just another way of visualizing CNNs; it doesn't mean they are not there: imagine a **filter between the layers that makes them smaller**.
![](https://miro.medium.com/v2/resize:fit:1000/0*so8rajH9EF43_OGc)

Complete CNN representation | [Source](https://developersbreach.com/convolution-neural-network-deep-learning/)

In the **feature extraction** part, the activations live in the **convolutional layers**. The process is quite straightforward: after each convolution operation, you apply an activation function to the result.

![](https://miro.medium.com/v2/resize:fit:700/0*HRiuP0ZsyHkRR5AL.png)

Convolutional layer structure | [Source](https://learnopencv.com/understanding-convolutional-neural-networks-cnn/)

The **pooling** and **flattening** layers **DO NOT have an activation function**. As we explained before, the main purpose of pooling layers is **dimensionality reduction** and the main purpose of flattening layers is **restructuring the data into a 1D vector**; we **don't need non-linearities** for that.

Nevertheless, we do need activation functions to extract **complex features** (we wouldn't be able to capture the relevant characteristics of an image with only a linear function).

In the **classification part**, all the fully connected layers and the output layer have an activation function, as in simple neural nets. Here we also need activation functions because we are using the extracted features to make a classification or a prediction, and the algorithm has to **learn complex interactions** (as a simple neural network would).

### **Activations — Convolutional and dense layers**

**ReLU:** the most common activation function. It outputs the input directly **if it is positive**; otherwise, it **outputs zero**. It has the benefit of **reducing training time** and mitigating the **vanishing gradient problem**.

**Leaky ReLU:** a variation of ReLU that allows a **small, non-zero gradient** when the unit is inactive, which can help **prevent dead neurons** during training.
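Both functions are one-liners applied element-wise to a feature map. A hedged sketch (the slope 0.01 for Leaky ReLU is a common default, not a value from the article):

```python
def relu(x):
    """Pass positive values through unchanged; zero out everything else."""
    return x if x > 0 else 0.0

def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but negative inputs keep a small, non-zero slope."""
    return x if x > 0 else negative_slope * x

values = [-2.0, 0.0, 3.0]
activations = [relu(v) for v in values]        # negatives become 0.0
leaky = [leaky_relu(v) for v in values]        # negatives are scaled, not killed
```

The small negative slope is what keeps the gradient non-zero for inactive units, which is exactly how Leaky ReLU avoids "dead" neurons.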
[Activation functions: ReLU vs. Leaky ReLU](https://medium.com/mlearning-ai/activation-functions-relu-vs-leaky-relu-b8272dc0b1be)

### **Activations — Output layer**

**Sigmoid:** produces an output in the **range (0, 1)**. It's **not commonly used in hidden layers anymore** due to the vanishing gradient problem, but it's still used for **binary classification** in the output layer.

**Tanh (Hyperbolic Tangent):** outputs values in the **range (-1, 1)**. It is similar to the sigmoid but can give better training performance for some problems thanks to its output range.

[Activation Functions in Neural Networks](https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6)

### Bibliography

1. [*Polo Club of Data Science. (2020). CNN Explainer.*](https://poloclub.github.io/cnn-explainer/)
2. [*IBM. (2020). Convolutional Neural Networks.*](https://www.ibm.com/topics/convolutional-neural-networks)
3. [*Saha, S. (2018). A Comprehensive Guide to Convolutional Neural Networks — the ELI5 way. Towards Data Science.*](https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53)
4. [*Cireșan, D. C. (2016). Convolutional Neural Networks for Visual Recognition. Springer International Publishing.*](https://doi.org/10.1007/978-3-319-57550-6)
5. [*DeepLearning.TV. (2019). Convolutional Neural Networks (CNNs) explained. [Video]. YouTube.*](https://www.youtube.com/watch?v=KuXjwB4LzSA&t=22s)

Thanks for reading! If you like the article, make sure to clap (up to 50!)
and follow me on [Medium](https://medium.com/@jorgecardete) to stay updated with my new articles. Also, make sure to follow my **new publication**: [The Deep Hub](https://medium.com/thedeephub).