Density plots and box plots

What should be analysed?

  • Density plot, histogram, violin plots

    • Mean value or typical value
    • Symmetry
    • Variation
    • Whether it resembles a known distribution
    • Heavy/light tailed
    • One or more modes
    • Skewness

  • Box plot

    • Median
    • Variation
    • Outliers
    • Symmetry
    • Quantiles
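Most items on the checklist above have a numeric counterpart that can be checked alongside the plot. The following is a minimal sketch in base R, assuming the built-in `mtcars` data and the `mpg` column as the variable of interest; the moment-based skewness formula and the "local maxima of the kernel density" mode count are illustrative choices, not the only possible definitions.

```r
# Numeric counterparts of the visual checklist, using the built-in mtcars data.
x <- mtcars$mpg

centre <- c(mean = mean(x), median = median(x))    # typical value
spread <- c(sd = sd(x), iqr = IQR(x))              # variation
quart  <- quantile(x, probs = c(0.25, 0.5, 0.75))  # quantiles

# Moment-based sample skewness: > 0 suggests a longer right tail.
skewness <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# Count modes as local maxima of the kernel density estimate.
d       <- density(x)
n_modes <- sum(diff(sign(diff(d$y))) == -2)

# Points beyond the box plot whiskers (1.5 * IQR rule).
outliers <- boxplot.stats(x)$out
```

The mode count depends on the bandwidth chosen by `density()`, so it should be read together with the plot rather than on its own.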

Density plots and box plots

Example: Visualizing miles per gallon depending on transmission type
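A sketch of how this example could be prepared in base R, assuming the data are the built-in `mtcars` set, where `am` codes the transmission type (0 = automatic, 1 = manual). The column name `transmission` is introduced here for illustration.

```r
# mtcars$am codes transmission: 0 = automatic, 1 = manual.
cars <- transform(mtcars,
                  transmission = factor(am, labels = c("automatic", "manual")))

# Box plot statistics per group, computed without drawing (plot = FALSE).
bp <- boxplot(mpg ~ transmission, data = cars, plot = FALSE)
colnames(bp$stats) <- bp$names
rownames(bp$stats) <- c("lower whisker", "lower hinge", "median",
                        "upper hinge", "upper whisker")
print(bp$stats)

# One kernel density estimate per group, ready for plot() and lines().
dens <- lapply(split(cars$mpg, cars$transmission), density)

# To draw the box plots: boxplot(mpg ~ transmission, data = cars)
```

Comparing the two columns of `bp$stats` covers the median, variation, and quantile items of the box plot checklist for each transmission type.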

Scatter plot

  • Y: dependent variable, X: independent variable
  • Adding a smoother is a good idea

Analysis:

  • Shape (data = true + error; true = linear, quadratic, cubic, exponential, ..., or empirical)
    • How to find the right model?
      • Fitting the data (regression)
      • Analysis of residuals or model selection methods
  • Strength (how close the observations are to a hypothesized model)
    • If linear: correlation \(r\) or coefficient of determination \(R^2\)

  • Direction (if monotonic, decreasing or increasing; if not monotonic, which parts increasing, which decreasing)
  • Density (dense areas, sparse areas)
  • Outliers
  • Clusters
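The shape and strength items of the analysis can be sketched numerically in base R. The example below assumes the built-in `mtcars` data with weight (`wt`) as X and fuel economy (`mpg`) as Y; the choice of a linear and a quadratic candidate, and of AIC as the model selection method, is only illustrative.

```r
# Built-in mtcars: weight (wt, in 1000 lbs) as X, fuel economy (mpg) as Y.
fit_lin  <- lm(mpg ~ wt, data = mtcars)
fit_quad <- lm(mpg ~ wt + I(wt^2), data = mtcars)

# Strength of a linear relationship.
r  <- cor(mtcars$wt, mtcars$mpg)
r2 <- summary(fit_lin)$r.squared   # equals r^2 for simple linear regression

# Shape: compare candidate models (lower AIC is preferred) ...
aic <- AIC(fit_lin, fit_quad)

# ... and inspect the residuals; a visible trend suggests the shape is wrong.
res <- residuals(fit_lin)

# Smoother to overlay on the scatter plot.
smooth <- lowess(mtcars$wt, mtcars$mpg)

# To draw: plot(mpg ~ wt, data = mtcars); lines(smooth)
```

The sign of `r` gives the direction (negative here, so the relationship is decreasing), while `r2` and the residuals speak to strength and shape.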

Scatter plot

Example: Visualizing weight and rear axle ratio

Scatter plot

  • More variables can be mapped
    • Mark shape
    • Mark size
    • Mark color
    • Mark orientation
    • Juxtaposed displays or superimposed displays
  • If juxtaposed displays are used, we get a scatterplot matrix (SPLOM)
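A minimal base R sketch of mapping extra variables to mark color and mark shape in a scatterplot matrix, assuming the built-in `mtcars` data. The chosen columns, colors, and plotting symbols are arbitrary illustrations.

```r
vars <- mtcars[, c("mpg", "wt", "drat", "hp")]

# Map transmission type to mark color and cylinder count to mark shape.
mark_col <- c("steelblue", "tomato")[mtcars$am + 1]
mark_pch <- c(16, 17, 15)[factor(mtcars$cyl)]

pdf(NULL)  # draw to a null device; drop this line to see the plot
pairs(vars, col = mark_col, pch = mark_pch,
      panel = function(x, y, ...) {
        points(x, y, ...)
        lines(lowess(x, y), col = "grey40")  # smoother in every panel
      })
invisible(dev.off())
```

Each panel is one juxtaposed scatter plot, so the same shape, strength, and direction analysis applies panel by panel.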

SPLOM

3D Surface plots and contour plots

  • Remember: the surface is drawn from interpolated data
  • Analysis:
    • Peaks and troughs
    • Trends
    • Additivity
    • Always check the underlying data afterwards
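The analysis steps can be sketched on a gridded surface in base R. The example assumes the built-in `volcano` height matrix as the surface; the helper `additive_resid()` is a hypothetical name, and it interprets "additivity" as the surface being a sum of a row effect and a column effect, which is one possible reading.

```r
z <- volcano  # built-in grid of heights

# Peaks and troughs: grid positions of the highest and lowest values.
peak   <- which(z == max(z), arr.ind = TRUE)[1, ]
trough <- which(z == min(z), arr.ind = TRUE)[1, ]

# Contour lines at chosen levels, computed without drawing.
iso <- contourLines(x = seq_len(nrow(z)), y = seq_len(ncol(z)), z = z,
                    levels = seq(100, 190, by = 10))

# Additivity check: if z[i, j] = a[i] + b[j], these residuals are all zero.
additive_resid <- function(z) {
  z - (outer(rowMeans(z), colMeans(z), "+") - mean(z))
}
resid_z <- additive_resid(z)

# To draw: contour(z); persp(z, theta = 30, phi = 30)
```

Large values in `resid_z` indicate regions where the surface departs from an additive structure; these are worth checking against the underlying, non-interpolated data.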

Geospatial data

  • Geographical coordinates are involved
  • Used in many applications
    • Climate modeling/analysis
    • Economic/social data analysis
    • Transaction data

Spatial phenomena

  • Point phenomena (e.g. building location, city location)
  • Line phenomena (paths, roads)
  • Area phenomena (counties)
  • Surface phenomena (mountains)

Types of maps

  • Symbol/dot maps (nominal/ordinal point data)

  • Land use maps/Choropleth maps (nominal/ordinal area data)

  • Line diagrams (nominal/ordinal line data)

  • Isoline maps (ordinal surface data)

  • Surface maps (ordinal volume data)

  • Note: Different maps can be used for the same data

    • Choropleth map / Dot map
    • Density surface / Dot map

What is a map?

  • Map coordinates:

    • Longitude \(\lambda \in [-180, 180]\), negative = west
    • Latitude \(\phi \in [-90, 90]\), negative = south
  • Challenge: \([\lambda, \phi] \rightarrow [x,y]\)

  • Different map projections

    • Conformal projection: retains angles (shapes) but not area
    • Equal area: retains areas but not angles (shapes)

  • Cylindrical projection, plane projection and cone projection
  • A cylindrical projection (Web Mercator) is used by Google Maps and is now the de facto standard for web maps

Cylindrical projection

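  • Mercator projection: conformal; far northern/far southern areas are inflated
    • Defined by \(x=\lambda,\ y=\ln\tan(\pi/4+\phi/2)\), with angles in radians
  • Equirectangular projection: defined by \(x=\lambda,\ y=\phi\); the simplest cylindrical projection, neither conformal nor equal-area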
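As a minimal sketch of the \([\lambda, \phi] \rightarrow [x, y]\) mapping for cylindrical projections, the base R functions below implement the equirectangular form \(x=\lambda, y=\phi\) and the conformal Mercator form \(x=\lambda, y=\ln\tan(\pi/4+\phi/2)\). The function names are illustrative, a unit sphere is assumed, and inputs are in degrees.

```r
deg2rad <- function(deg) deg * pi / 180

# Equirectangular: x = lambda, y = phi (angles in radians).
equirectangular <- function(lon, lat) {
  cbind(x = deg2rad(lon), y = deg2rad(lat))
}

# Mercator: x = lambda, y = ln(tan(pi/4 + phi/2)); undefined at the poles.
mercator <- function(lon, lat) {
  phi <- deg2rad(lat)
  cbind(x = deg2rad(lon), y = log(tan(pi / 4 + phi / 2)))
}

# Area inflation of Mercator relative to the equator: 1 / cos(phi)^2.
area_inflation <- function(lat) 1 / cos(deg2rad(lat))^2

mercator(lon = c(0, 15, 15), lat = c(0, 60, -60))
area_inflation(c(0, 60, 80))
```

The inflation factor grows without bound towards the poles; at 60 degrees latitude areas are already drawn four times too large, which is the distortion of far northern and far southern regions noted above.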

Cone projection

  • Albers Equal-area projection
    • Preserves areas
    • Shapes and distances are distorted
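For reference, the standard Albers equal-area conic formulas on a unit sphere can be sketched in base R as follows. The function name `albers()` and the default origin and standard parallels (`lat1 = 20`, `lat2 = 50`) are assumptions made for illustration; in practice the standard parallels are chosen to bracket the region being mapped.

```r
# Albers equal-area conic projection on a unit sphere; inputs in degrees.
# lon0/lat0: projection origin; lat1/lat2: standard parallels (illustrative defaults).
albers <- function(lon, lat, lon0 = 0, lat0 = 0, lat1 = 20, lat2 = 50) {
  rad    <- pi / 180
  lambda <- lon * rad
  phi    <- lat * rad
  n      <- (sin(lat1 * rad) + sin(lat2 * rad)) / 2
  C      <- cos(lat1 * rad)^2 + 2 * n * sin(lat1 * rad)
  rho    <- sqrt(C - 2 * n * sin(phi)) / n
  rho0   <- sqrt(C - 2 * n * sin(lat0 * rad)) / n
  theta  <- n * (lambda - lon0 * rad)
  cbind(x = rho * sin(theta), y = rho0 - rho * cos(theta))
}

albers(lon = c(0, 10, -10), lat = c(0, 40, 40))
```

Meridians become straight lines converging towards a common point and parallels become circular arcs, which is why shapes change with distance from the standard parallels while areas stay correct.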

Visual variables for spatial data


Symbol/Dot maps

  • Data = latitude, longitude + other variables
  • Latitude, longitude -> coordinates; other variables -> visual aesthetics
    • The number of aesthetics is limited! (perception problems)
  • Another approach: multiple parameters on multiple maps

  • Analysis:
    • Density within geographic areas and between geographic areas
    • Spatial pattern of density (north, south)
    • Clusters, outliers
  • Problems:
    • Overplotting in highly populated areas
    • Several observations may share the same coordinates
    • Size aesthetic used -> perception problem
      • Perceived size depends on the local neighborhood (Ebbinghaus illusion)
    • Color used -> color perception problems

    • Absolute vs relative mapping (proportional to population)
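Three of the problems listed above (shared coordinates, size perception, and absolute versus relative mapping) can be handled at the data preparation stage. The base R sketch below uses entirely made-up cities, coordinates, and population figures; the column names are hypothetical.

```r
# Hypothetical observations: several events share the same coordinates.
events <- data.frame(
  city = c("A", "A", "A", "B", "C", "C"),
  lon  = c(18.07, 18.07, 18.07, 11.97, 13.00, 13.00),
  lat  = c(59.33, 59.33, 59.33, 57.71, 55.61, 55.61)
)
population <- c(A = 900000, B = 600000, C = 150000)  # made-up values

# Collapse identical coordinates into one symbol carrying a count.
counts <- aggregate(n ~ city + lon + lat, data = cbind(events, n = 1), FUN = sum)

# Relative mapping: events per 100,000 inhabitants instead of raw counts.
counts$rate <- unname(counts$n / population[as.character(counts$city)] * 1e5)

# Scale the radius with sqrt() so that symbol AREA is proportional to the count.
counts$radius <- sqrt(counts$n / max(counts$n))

print(counts)
```

In this made-up data, city A has the largest raw count but city C has the highest rate, which shows how the choice between absolute and relative mapping can change the message of the map.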

Line diagrams

  • Observation: set of (latitude, longitude) pairs + other variables
  • Often: start, end point

Line diagrams

  • Analysis: same as in network analysis, plus the geographical relationships between links and their density (size)
    • Where are the dense links located?
    • How are the links directed?
  • Problems:
    • Overplotting
    • Line length analysed -> length perception problem
    • Line width analysed -> volume perception problem
    • Colors analysed -> color perception problem

  • Overplotting, possible solution:
    • Use curved lines and minimize edge crossings

Visualizing area data

  • Data: name/coordinates of geographic area + other variables
  • Choropleth maps: variables are mapped to the color or shading of regions on the map

Choropleth maps

  • Analysis:
    • Find clusters of regions that are similar
    • Find unusual regions (compared to neighbor regions)
    • Find patterns on the map
  • Problems affecting perception:
    • Color/grayscale mapping
    • Choice of regions (county, state,…)
    • A larger region with the same color looks dominant
    • Patterns in small/densely populated areas are hard to see
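The color/grayscale mapping problem is partly a question of how values are binned into classes before colors are assigned. The base R sketch below contrasts equal-width and quantile classes on made-up values for eight hypothetical regions; the two-color ramp is an arbitrary light-to-dark choice.

```r
# Hypothetical values for eight regions.
value <- c(r1 = 12, r2 = 15, r3 = 18, r4 = 22, r5 = 30, r6 = 45, r7 = 80, r8 = 200)

# Equal-width classes: one extreme value pushes most regions into one class.
equal_width <- cut(value, breaks = 4)

# Quantile classes: the same number of regions per color.
quantile_cls <- cut(value, breaks = quantile(value, probs = seq(0, 1, 0.25)),
                    include.lowest = TRUE)

# One color per class from a sequential ramp (light = low, dark = high).
ramp <- colorRampPalette(c("#deebf7", "#08519c"))(4)
fill <- ramp[as.integer(quantile_cls)]

table(equal_width)
table(quantile_cls)
```

With equal-width classes the map would be almost one color, hiding differences between most regions; quantile classes spread the colors evenly but hide how extreme the largest value is. Neither choice is neutral, so the legend should always state the class breaks.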

Visualizing area data

  • Isarithmic maps: show the areas of a phenomenon on the map (density)
    • Contour map
    • Topographic map

Software for geospatial visualization

  • Plenty of commercial and non-commercial software
    • ArcGIS, Google, Yahoo, Microsoft map API
  • Plotly
    • plot_geo()
    • Using Mapbox
  • To use Mapbox:
    • Register with your email, find your token
    • Run in R: Sys.setenv('MAPBOX_TOKEN' = 'your_mapbox_token_here')
  • ggplot2
    • geom_sf()
    • ggmap

Using maps

  • A few countries are available through plotly
  • Downloading a map of a country:
    • Find a country map at http://gadm.org/
    • Decide what level of detail is needed (region, county,…)
    • Download the GeoJSON file, unzip if needed
    • Read the JSON file into R
    • Use with ggplot2: ggplot() + geom_sf()
    • Use with Plotly: plot_geo() %>% add_trace(type = "choropleth" / "scattergeo")
    • Use with Mapbox: plot_mapbox() %>% add_trace(type = "choroplethmapbox" / "scattermapbox")
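The read-and-plot steps above can be sketched with the sf and ggplot2 packages. To stay self-contained, the example writes a tiny hand-made GeoJSON file that stands in for a downloaded country file; the two square "regions", the `value` property, and the use of `NAME_1` as the region name field are assumptions for illustration. The plotting part only runs if both packages are installed.

```r
# A tiny hand-written GeoJSON file stands in for a downloaded country file.
geojson <- '{"type":"FeatureCollection","features":[
 {"type":"Feature","properties":{"NAME_1":"West","value":10},
  "geometry":{"type":"Polygon","coordinates":[[[0,0],[1,0],[1,1],[0,1],[0,0]]]}},
 {"type":"Feature","properties":{"NAME_1":"East","value":25},
  "geometry":{"type":"Polygon","coordinates":[[[1,0],[2,0],[2,1],[1,1],[1,0]]]}}]}'
path <- tempfile(fileext = ".geojson")
writeLines(geojson, path)

if (requireNamespace("sf", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  regions <- sf::st_read(path, quiet = TRUE)        # read the JSON file into R
  p <- ggplot2::ggplot(regions) +
    ggplot2::geom_sf(ggplot2::aes(fill = value)) +  # choropleth: value -> fill
    ggplot2::scale_fill_gradient(low = "#deebf7", high = "#08519c")
  # print(p) draws the map
}
```

With a real file, `path` would point to the downloaded GeoJSON, and the variable to display would usually be joined to `regions` by region name before plotting.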

Finding locations

Read at home

  • Chapter 6
  • Plotly book, ch 2.2, 2.4 and 2.5