class: center, middle, inverse, title-slide # Using R to Orchestrate APIs ## Research Computing Symposium 2017 ### John Little ### 2017-02-01 --- exclude: true class: center, middle background-image: url(https://d1avok0lzls2w.cloudfront.net/img_uploads/apis-for-marketers.png) --- class: bottom background-image: url(https://d1avok0lzls2w.cloudfront.net/img_uploads/apis-for-marketers.png) ## Using R to Orchestrate APIs A presentation for [Research Data at the /Edge](http://library.duke.edu/edge/events/rc17) Hosted by the [Data & Visualization Services](http://library.duke.edu/data/) Department Day One, [Duke Research Computing Symposium](https://rc.duke.edu/symposium-2017/), 2017 ??? Image credit: [Moz](https://moz.com/blog/apis-for-datadriven-marketers) --- class: bottom ### Presentation & Supporting Files - Github Repo -- https://github.com/libjohn/r-api-json - Web Site -- https://libjohn.github.io/rcs2017/ - [Slides](https://libjohn.github.com/rcs2017/slides.html) - [Demonstration](http://libjohn.github.io/rcs2017/demonstration.nb.html) - [Hands-on](http://libjohn.github.io/rcs2017/handson.nb.html) ### Eat Your Own Dogfood This presentation consists of an R Notebook and slides composed in *Rmarkdown* via *Rstudio*, slides made with `devtools::install_github("yihui/xaringan")`, files stored in a *Github Repository*, Slides & Notebook served via *Github Pages*. ??? WARNING: Mixing Fictions and Metaphors --- ## Outline - API - JSON - Orchestration: Demonstration in RStudio - Hand's On (*You can use [OIT's RStudio Docker Container](https://vm-manage.oit.duke.edu/containers/rstudio)*) --- ## Why APIs? ### The Web has lots of stuff - frontier beyond curated datasets - stuff is wrapped in HTML - HTML is transported over HTTP but composed for h2m consumption ??? To get BULK Data! -- ## Intellectual Property rights bear serious consideration ??? Check with the Library's Office of Copyright and Scholarly Communications --- ## API ### Application Program Interface - Built for machine-to-machine interactions - Instructions for programs ![API schematic](images/api.png) --- ### Client / Server ![](images/Client-server-model.svg.png) - Make [R] interface with the web - Same as h2m but now m2m --- ### h2m Simulation... - Person enters a URL ![Parts of URL](images/URL.PNG) -- - Client & server negotiate handshake (*dramatization...*) -- .right[![dramatization: good handshake](images/good-handshake.gif)] --- - Web Browser parses the HTML -- .right[![happy parsing dance](images/result-happyDance.gif)] ??? Ever seen HTML before? --- - Information is sent back in wrapped HTML ```html <!DOCTYPE html> <html> <!-- created 2010-01-01 --> <head> <title>sample</title> </head> <body> <p>Voluptatem accusantium totam rem aperiam.</p> </body> </html> ``` --- ## JSON * [Javascript Object Notation](https://en.wikipedia.org/wiki/JSON) is a language-independent data format * Currently the most common data data format for asynchronous client/server communication format * Consists of key-value pairs ```json # from https://en.wikipedia.org/wiki/JSON { "firstName": "John", "lastName": "Smith", "isAlive": true, "age": 25, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021-3100" }, "phoneNumbers": [ { "type": "home", "number": "212 555-1234" }, { "type": "office", "number": "646 555-4567" }, { "type": "mobile", "number": "123 456-7890" } ], "children": [], "spouse": null } ``` --- ## m2m -- development - Make [R] interface with the web - Same as h2m but now m2m *dramatization...* -- .right[![dramatization: confused about the protocol](images/development-confusion.gif)] --- ## Next - Demonstration - using an API to http://omdb.org (OMDB is like IMDB.com) - http://libjohn.github.io/rcs2017 - Hands-on - using An API Of Ice And Fire -- http://anapioficeandfire.com/ --- ## R Packages -- Related *People who use JSONlite also use...* * [httR](https://cran.r-project.org/web/packages/httr/) -- calls JSONlite in service to major goal of orchestrating HTTP (web scraping) * [rvest](https://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/) -- used for HTML parsing --- ## Resources - [Extracting Data from the Web Part 1](https://www.rstudio.com/resources/webinars/extracting-data-from-the-web-part-1/). RStudio Webinar Video - [JSONlite package](https://cran.r-project.org/web/packages/jsonlite/index.html) - Movies of 1976 - [OMDB Top Movies](http://www.omdb.org/encyclopedia/year/1976/statistics) - [IMDB Most Popular](http://www.imdb.com/year/1976/) - http://www.omdbapi.com/ - http://anapioficeandfire.com/ --- ## Image Credits - [API schematic](https://moz.com/blog/apis-for-datadriven-marketers) - [Client / Server](https://commons.wikimedia.org/wiki/File:Client-server-model.svg) - [URL](https://commons.wikimedia.org/wiki/File:Uniform_Resource_Locator_%28URL%29_example.PNG) - [Good human handshake](http://giphy.com/gifs/thomas-U2XboRuN89Idi) - [happy parsed dance](http://giphy.com/gifs/80s-1980s-thomas-dolby-wCKmBd7oNtA4g) - [NASA animated GIF](http://i.giphy.com/l2Jht4lIfEQfJ3zj2.gif) --- ## Shareable under CC BY-NC-SA license Data, presentation, and handouts are shareable under [CC BY-NC-SA license](https://creativecommons.org/licenses/by-nc-sa/4.0/) ![This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.](https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png "This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License")