
How to classify deep space radio signals using deep learning

In this project we use 2-D spectrograms of deep-space radio signals collected by the antennas of the SETI (Search for Extraterrestrial Intelligence) Institute.

The institute uses its Allen Telescope Array, a wide range of antennas, to scan the night sky for very faint radio signals from outer space. We will treat these 2-D spectrograms as images and feed them to an image classifier that sorts the faint radio signals into one of four categories.

To give you a bit more context: the SETI Institute uses the Allen Telescope Array, situated in northern California, to scan the sky at various radio frequencies and observe star systems with known exoplanets.

The goal is to search for faint but persistent signals.

The current signal detection system is programmed for only particular kinds of signals, such as narrow-band carrier waves. However, the system sometimes triggers, with some unknown efficiency, on signals that are not narrow-band and are also not known radio-frequency interference. Several categories of such events have been observed in the recent past. Our goal for the next few minutes, then, is to build an image classification model that classifies these signals accurately in real time. This may allow the detection system to make better observational decisions, increasing the efficiency of the night scans and allowing explicit detection of signal types.

I’m assuming you have prior knowledge of coding in Python and a basic idea of how a neural network, especially a convolutional neural network, works under the hood, as I won’t be getting into the maths.

We’ll begin by importing the libraries.
We’ll be using TensorFlow v2.2.0.
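The original import cell didn’t survive the page conversion; a minimal sketch of what it would contain might look like this (the aliases are the usual conventions, not fixed by the post):

```python
import numpy as np                # array manipulation and reshaping
import pandas as pd               # reading the CSV files of pixel values
import matplotlib.pyplot as plt   # visualizing the spectrograms
import tensorflow as tf           # the post targets TensorFlow v2.2.0

print(tf.__version__)
```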

We use pandas to read the CSV file where the images are stored.
But how can images be stored in a CSV file?

The spectrogram images were converted into their raw pixel-intensity values and normalized so the values lie between 0 and 1. Each image is then flattened (stretched) into a one-dimensional array, so each row of the CSV file corresponds to a single image.

Each row corresponds to an image.

The labels are one-hot encoded into a vector of shape (1, 4), where 4 is the number of classes.
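To make the encoding concrete, here is a small sketch of what a one-hot label row looks like (the class ordering is hypothetical — the actual order depends on how the CSV was built):

```python
import numpy as np

def one_hot(index, num_classes=4):
    """Return a (1, num_classes) one-hot row vector for a class index."""
    vec = np.zeros((1, num_classes))
    vec[0, index] = 1.0
    return vec

# Class 1 out of 4 becomes [[0. 1. 0. 0.]]
print(one_hot(1))
```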

We read the files into pandas DataFrames and specify that we don’t require a header, as the CSV files have no header row.
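The reading step can be sketched as follows. To keep the snippet self-contained, it uses in-memory stand-ins for the real files; in the project you would pass the actual CSV paths (the filenames here are placeholders):

```python
import io
import pandas as pd

# Stand-ins for the real files, e.g. "images.csv" and "labels.csv"
# (hypothetical names) -- each row is one flattened image / label.
images_csv = "0.1,0.2,0.3,0.4\n0.5,0.6,0.7,0.8\n"
labels_csv = "1,0,0,0\n0,1,0,0\n"

# header=None: the first row is data, not column names.
train_images = pd.read_csv(io.StringIO(images_csv), header=None)
train_labels = pd.read_csv(io.StringIO(labels_csv), header=None)

print(train_images.head())
print(train_labels.head())
```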

We then visualize the DataFrames using the .head() method.

and see the following output:

Output for the image DataFrame
Output for the labels DataFrame

We then check the shape of the DataFrames

and find the following output:

8192 is simply 64 × 128, so we’ll reshape each row into a 64 × 128 array (the two spatial dimensions of the image).

Since all of the images are 2-D spectrograms, there is no RGB channel information, so we put a 1 in the last dimension; we would have used 3 had the images been coloured, or had colour carried meaning here.

We also converted the DataFrames into arrays (.values), because our neural network can’t take a DataFrame directly; the image classifier expects the data in the specific shape we obtain after reshaping.
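The reshape step amounts to this (shown with synthetic data standing in for the real arrays):

```python
import numpy as np

# Synthetic stand-in for the flattened spectrogram rows (3 images).
# In the project this array comes from the DataFrame's .values.
flat = np.random.rand(3, 8192)

# Reshape to (num_images, height, width, channels); the trailing 1
# is the single "grayscale" channel discussed above.
images = flat.reshape(-1, 64, 128, 1)

print(images.shape)  # (3, 64, 128, 1)
```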

Let’s try to visualize the images using matplotlib:
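The plotting cell was lost in conversion; a helper along these lines (the function name and the random selection are assumptions) reproduces the idea — pick a few images at random and show them with their class index:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

def plot_random_images(images, labels, n=3):
    """Show n randomly selected spectrograms with their class index."""
    idx = np.random.choice(len(images), n, replace=False)
    fig, axes = plt.subplots(1, n, figsize=(12, 4))
    for ax, i in zip(axes, idx):
        ax.imshow(images[i].reshape(64, 128), cmap="gray")
        ax.set_title(f"class {np.argmax(labels[i])}")
        ax.axis("off")
    fig.savefig("sample_spectrograms.png")

# Synthetic stand-ins so the sketch runs end to end.
demo_images = np.random.rand(10, 64, 128, 1)
demo_labels = np.eye(4)[np.random.randint(0, 4, 10)]
plot_random_images(demo_images, demo_labels)
```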

Looks like all three of the randomly selected images are of the type Narrow-band.

You can re-run the function to see a different set of images.

As the data was already processed and normalized, we don’t need to do a lot of pre-processing.

Let me give you a brief introduction to CNNs in case you’re a bit rusty. CNNs are a type of feed-forward neural network consisting of multiple layers of neurons with learnable weights and biases. Each neuron in a layer receives input from the preceding layer, processes it, and optionally applies a non-linearity, so the model learns not just a linear combination of the data but also complex non-linear functions.
A CNN has several kinds of layers: convolution layers, max-pool layers for down-sampling, and dropout layers for regularization, all followed by one or more fully-connected layers at the end. At each layer, small groups of neurons process portions of the input image, and their outputs are tiled so that the input regions overlap, yielding a higher-level representation of the input image; this is repeated at every such layer.

The bottom line is that a CNN takes complex patterns in images and breaks them down into simple patterns through multiple hierarchical layers.
Max-pooling is simply a non-linear down-sampler: it partitions the input image into a set of rectangles and takes the maximum value of each region.

We’re gonna use Keras to implement our CNN. :)
We’ll first import all the utilities we need from Keras and TensorFlow:
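The import cell itself is missing from the page; given the layers discussed above, it plausibly looked something like this (exact contents are an assumption):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dropout,
                                     Flatten, Dense)
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import Adam
```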

Here’s the implementation of the model in Keras:
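The model code did not survive the conversion. A plausible sketch of such a classifier — filter counts, kernel sizes, and dense-layer width are my assumptions, not the author’s exact architecture — is:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dropout,
                                     Flatten, Dense)

model = Sequential([
    # Convolution + down-sampling, repeated, as described above
    Conv2D(32, (5, 5), activation="relu", input_shape=(64, 128, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (5, 5), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dropout(0.5),                    # regularization
    Dense(1024, activation="relu"),
    Dense(4, activation="softmax"),  # one of four signal classes
])
```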

Before we compile the model, let’s define a learning-rate scheduler.
A scheduler decays the learning rate after some ‘time’.

This keeps shrinking the learning rate: after five decay steps, for example, it becomes 0.005 × 0.96⁵ ≈ 0.004077.
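In Keras this is typically done with tf.keras.optimizers.schedules.ExponentialDecay; the arithmetic it performs (in staircase mode) can be sketched in plain Python:

```python
def exponential_decay(step, initial_lr=0.005, decay_rate=0.96, decay_steps=5):
    """Staircase exponential decay: the rate drops every decay_steps steps."""
    return initial_lr * decay_rate ** (step // decay_steps)

# After 25 steps (5 full decay periods): 0.005 * 0.96**5
print(round(exponential_decay(25), 6))  # 0.004077
```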

Here’s the summary of the model:

Before we start training our model, we need to define some callbacks if we want to save the model at certain checkpoints — for example, whenever it reaches a new lowest validation loss.
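A checkpoint callback along these lines would do that (the filename is a placeholder):

```python
from tensorflow.keras.callbacks import ModelCheckpoint

# Save the model only when validation loss improves.
checkpoint = ModelCheckpoint("best_model.h5",
                             monitor="val_loss",
                             save_best_only=True,
                             verbose=1)
```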

The History object is what model.fit() returns; the fit call is where we pass in the data generator we created in the previous step.
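The shape of that call can be sketched as follows; to keep the snippet runnable it uses a tiny stand-in model and random data rather than the real generator:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model and data so the sketch runs quickly;
# in the project you would use the CNN and generator defined earlier.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(64, 128, 1)),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(8, 64, 128, 1)
y = np.eye(4)[np.random.randint(0, 4, 8)]

history = model.fit(x, y, epochs=1, verbose=0)  # returns a History object
print(history.history.keys())
```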

The model trains for about 10 minutes on Google Colab without a GPU or TPU.

Let’s now evaluate our model.

Its accuracy is around 74%, which was considered a benchmark back in 2017, when this dataset was released for a SETI hackathon.

Let’s build a confusion matrix to analyse it better:
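The confusion-matrix cell is missing from the page; its logic can be sketched in plain numpy (sklearn.metrics.confusion_matrix does the same thing):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=4):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Class indices would come from np.argmax over the one-hot labels
# and over the model's softmax outputs.
y_true = [0, 1, 2, 3, 0, 1]
y_pred = [0, 1, 2, 2, 0, 3]
print(confusion_matrix(y_true, y_pred))
```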
