Whether it's to study vegetation, assess the Earth's surface temperature, map the effects of a flood or simply to spy on North Korea, there's always a good reason to use satellite data. And that's just as well: many Earth observation programs make their data freely available.
In this tutorial, we'll look at :
How to access data from the American Landsat program
How to use this data to produce a “true color” composite image of the surface with Python
How do I download Landsat data?
Landsat data can be accessed via the Earth Explorer website. For this example, we'll be looking for recordings made over the Paris region during the summer of 2019.
The Landsat 8 satellite takes 700 images (or “scenes”) a day in several frequency bands, from ultraviolet to infrared, including visible light. It covers the entire planet with a resolution of 30 meters every 16 days... And all this data is available free of charge!
Here's how it works:
To start with, you'll need to create an account. You can explore the data without it, but you'll need it to download it at the end.
The first tab, “Search criteria”, allows you to select the area of interest. There are a number of ways to do this, the simplest of which is to click directly on the map at the location you're interested in, in our case : Paris.
A little further down in the same tab, you can select the time period you're interested in. Don't hesitate to choose a wide period: the satellite passes by about twice a month, but cloud cover can make some passes unusable. In our case, the selection runs from June 1 to August 31, 2019:
You can now go to the next tab, “Data Sets”, to choose the data you're going to download. You'll see that the choice is vast... But we're only interested in Landsat 8 OLI/TIRS data:
You can now move on to the next tab, “Additional Criteria”, in particular to automatically reject images that are too swimmable, for example with a coverage of over 20%.
Now it's time to move on to the “Results” tab, where you'll find a list of scenes that match your search. You can see the area covered by each scene by clicking on the foot icon:
If you like the scene, click on the shopping cart icon to add it to your order. Once you've made your selection, click on “Item basket” on the right above the card (you'll need to be logged in and have an account). This will take you to your order summary:
Click on “Proceed to checkout”. You will receive a confirmation e-mail and a new e-mail when the data is ready to be downloaded. Don't be impatient: the wait can last several hours...
When you receive an e-mail with the subject “USGS ESPA product request order number ************** available for download”, click on the link. On the page, click on the link under “Order ID”. You can then download your scenes one by one by clicking on “Download”:
Understanding Landsat data
Congratulations, you've downloaded your first satellite data. But what exactly did you get? Let's take a closer look.
Each scene comes in the form of a compressed file of around 400MB. You can decompress this file with Winrar or 7zip, for example.
Once unzipped, you'll find 10 .tif files and one .xml file. The xml file gives you all the information you need about the scene: coordinates, dates, etc.
TIFF files are raster images. Each of these files contains different data. There's no need to go into the details here (if you're interested, you can find them here), just know that :
band 2 (the file ending in “band2.tif”) records radiation emitted by the earth's surface at wavelengths between 450 and 510 nanometers, which our eyes would see as blue,
band 3 records radiation between 530 and 590 nanometers, which corresponds to the color green,
band 4 is located between 640 and 670 nm, i.e. red.
The sum of these three bands corresponds to the spectrum visible to the human eye. Using these three files, we'll be able to construct an image that looks natural to us, rather like what we would have seen from the satellite.
But first, let's try to understand a little better what these TIFF files are. In reality, they're not very exotic objects; you can think of them in two ways:
A .tif file is an uncompressed digital image. You can open these files with popular image processing software such as Xnview or paint. In this case, you'll see something resembling a black-and-white image, with light areas representing surfaces that emit a lot in the frequency band you've opened, and dark areas representing those that emit less.
Another way of looking at it: a .tif file is a matrix or table containing numbers. Each number corresponds to the radiation emitted in the file's frequency band by an area of land measuring 30 meters by 30 meters. The higher the number, the greater the radiation, with 0 corresponding to zero radiation.
Create a natural color image with Python
As we have seen, bands 2, 3 and 4 correspond to light as perceived by our eyes, with blue (wavelengths between 450 and 510 nanometers), green (530-590 nm) and red (640-670 nm) respectively. These data can be used to compose an image close to what the human eye would have seen had it been in the place of the Landsat 8 sensors. This is what we're going to do with Python.
In addition to the great classics, we'll be using rasterio and, above all, earthpy. Earthpy is a library developed by the University of Colorado that will greatly simplify the manipulation of our data.
import os
from glob import glob
import matplotlib.pyplot as plt
import rasterio as rio
import earthpy.spatial as es
import earthpy.plot as ep
The first step is to retrieve the list of images for the different frequency bands and sort them, so that the visible bands are in 2nd, 3rd and 4th position in the list.
#Folder where data is stored
path = os.path.join("data")
#Create list of paths to .tif files to be used
bandeliste_landsat_data_band = glob(path + "/*band*.tif")
#Ranking
liste_landsat_data_band.sort()
We're now going to “stack” the data for each frequency band into a single file. It's as easy as pie, thanks to earthpy :
#Name and location of the file to be created
stack_path = os.path.join("data", "stack.tif")
#Creating and saving the raster
es.stack(liste_landsat_data_band, stack_path)
Note that we could have taken only the 3 frequency bands we're interested in. But we may need the other bands later: now that our global file is saved, we can use it without going through this step again.
You can even delete the original data, to give the hard disk some breathing space:
#Deleting original data
for f in liste_landsat_data_band:
os.remove(f)
The next step is to open the file just created with rasterio :
#File name and location (a priori unchanged)
stack_path = os.path.join("data", "stack.tif")
#Opening
with rio.open(raster_path) as src:
raster = src.read()
The final step is to create an image from the visible strips. Here again, earthpy makes the operation incredibly simple:
ep.plot_rgb(raster,
rgb = (3, 2, 1))
plt.show()
The rgb argument corresponds to the bands to be used in the order red, green, blue. These are the 4th, 3rd and 2nd bands, but as indexing starts at zero: rgb = (3, 2, 1).
A few additional arguments may be useful:
title to add a title
figsize to set the image size
stretch = True to correct brightness
For example:
ep.plot_rgb(raster,
rgb = (3, 2, 1),
stretch = True,
figsize = (79, 80),
title = "Ile de France\n(Image composite, Landsat)")
plt.show()
And here's the job:
Obviously, on a scene measuring 185 kilometers by 180, you can't see much at first glance. You can save this image and zoom in on the points that interest you, for example :
You can probably already see the applications that these images can have in regional planning, the monitoring of major infrastructures, agriculture or forestry, and certainly in many other fields... And we've only used 3 of the 10 data files available: a lot of information is still untapped!
About: Callendar is a start-up specializing in the development of innovative solutions for climate risk assessment. Aware of the challenge of adapting to climate change, we strive to share our expertise through free tools and tutorials like this one.