# About the guide

Welcome to Team From Above’s repository of work done during the JournalismAI 2021 Collab Challenge in the Americas.

# Background

For this year’s challenge, participants from various newsrooms across the Americas came together to work with the Knight Lab team at Northwestern University to explore how we might use AI technologies to innovate newsgathering and investigative reporting techniques.

Team members

- [Gibran Mena (DataCritica)](https://datacritica.org/)
- [Shreya Vaidyanathan (Bloomberg News)](https://www.bloomberg.com/)
- David Ingold (Bloomberg News)
- Flor Coehlo (LaNacion)
- María Teresa Ronderos (CLIP)

# Preparation

First, we load the libraries (packages) we need for this task.

# load required libraries (Note: if these packages are not installed, then install them first and then load)
 
library(sf)
library(stars)
library(ggplot2)
library(dplyr)
library(tidyr)
library(mapview)
library(caret)
library(forcats)
library(RStoolbox)
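
If some of these packages are not yet installed, one way to install the missing ones in bulk is sketched below (a minimal sketch; the package list simply mirrors the library() calls above).

# sketch: install any of the required packages that are not yet installed
required_pkgs <- c("sf", "stars", "ggplot2", "dplyr", "tidyr",
                   "mapview", "caret", "forcats", "RStoolbox")
missing_pkgs <- required_pkgs[!required_pkgs %in% rownames(installed.packages())]
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)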

# Reading the data in

Then we read in and explore the satellite data matching our Area of Interest (AOI), previously downloaded from services such as Planet’s NICFI program or Sentinel. A satellite image such as the ones from the Planet program encodes reflectance information in the blue, green, red and near-infrared spectral bands. We load the satellite image, in this case Planet’s analytic tif file with four bands: B, G, R and near infrared (NIR).

# Understanding the layers of raster images

# read the Planet analytic tif as a stars (proxy) object and plot its bands
satellite_data <- read_stars("input/colombia2_analytic.tif", proxy = TRUE, resample = "cubic_spline")
plot(satellite_data)

As you can see, there are five available bands for this image, which are already loaded into a stack (several images covering exactly the same area at the same resolution, but carrying different information). These information layers will constitute the data that feeds the learning algorithm once we match it with our labelling data.

We could read in the layers one by one and then stack them together, but for this tutorial we read them all at once in the first example. To get to know our data better, we can inspect its coordinate reference system, resolution, and the number and names of its layers (a short sketch for naming the bands follows the output below).

st_crs(satellite_data)
## Coordinate Reference System:
##   User input: WGS 84 / Pseudo-Mercator 
##   wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
##     BASEGEOGCRS["WGS 84",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["Popular Visualisation Pseudo-Mercator",
##         METHOD["Popular Visualisation Pseudo Mercator",
##             ID["EPSG",1024]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["False easting",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Web mapping and visualisation."],
##         AREA["World between 85.06°S and 85.06°N."],
##         BBOX[-85.06,-180,85.06,180]],
##     ID["EPSG",3857]]
st_dimensions(satellite_data)
##      from   to   offset    delta                   refsys point values x/y
## x       1 4096 -8238077  4.77731 WGS 84 / Pseudo-Mercator FALSE   NULL [x]
## y       1 4096   371790 -4.77731 WGS 84 / Pseudo-Mercator FALSE   NULL [y]
## band    1    5       NA       NA                       NA    NA   NULL
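
As a convenience, we can also give the bands descriptive names by splitting the band dimension into separate attributes. The sketch below assumes the B, G, R, NIR band order discussed in the next step; any remaining band keeps its default name.

# sketch: split the band dimension into attributes and name the first four bands
sat_bands <- split(st_as_stars(satellite_data), "band")
names(sat_bands)[1:4] <- c("blue", "green", "red", "nir")
sat_bands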

Before moving on, we can generate additional information, such as the Normalized Difference Vegetation Index (NDVI), which is commonly used in land classification tasks. To find out which spectral band each layer of an analytic image (as opposed to the visual images downloaded from Planet) represents, we can consult the spec sheet for this imagery: https://www.planet.com/products/satellite-imagery/files/1610.06_Spec%20Sheet_Combined_Imagery_Product_Letter_ENGv1.pdf. We learn that the band order for our image is B, G, R, NIR.
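
NDVI is defined as (NIR - red) / (NIR + red). A minimal sketch of computing it from the stars object, assuming band 3 is red and band 4 is NIR as described above:

# sketch: compute NDVI, assuming band 3 = red and band 4 = near infrared
sat <- st_as_stars(satellite_data)  # realize the proxy in memory
ndvi <- st_apply(sat, c("x", "y"),
                 function(px) (px[4] - px[3]) / (px[4] + px[3]))
plot(ndvi, main = "NDVI")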

# Plotting true and false color composites

# plot a true color (R, G, B) and a false color (NIR, R, G) composite side by side
par(mfrow = c(1, 2))

raster::plotRGB(as(satellite_data, "Raster"), r = 3, g = 2, b = 1,
                axes = TRUE, stretch = "lin", main = "True Color Composite of AOI")
raster::plotRGB(as(satellite_data, "Raster"), r = 4, g = 3, b = 2,
                axes = TRUE, stretch = "lin", main = "False Color Composite of AOI")