# datashakereviewsapi: Python API wrapper for DATASHAKE reviews
Python API wrapper for the DATASHAKE reviews API (https://www.datashake.com/review-scraper-api).
This module makes it easier to schedule jobs and fetch the results.

Official web API documentation: https://api.datashake.com/#reviews

You need a DATASHAKE API key to use this module.
## Installation
```sh
pip install datashakereviewsapi
```
## Usage examples
Create an API instance:
```python
from datashakereviewsapi import DatashakeReviewAPI
# Initiate API instance with your API key from DATASHAKE
api = DatashakeReviewAPI('your_datashake_reviews_scraper_api_key')
```
Schedule a single job with the URL of a review page.
The DATASHAKE API takes several hours to crawl the page and collect the results.
```python
response = api.schedule_job('https://uk.trustpilot.com/review/store.playstation.com')
# save job_id for querying the results later
first_job_id = response['job_id']
```
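Because results arrive hours after scheduling, the job id usually has to outlive the Python session that created it. The following is a minimal sketch of one way to persist it, assuming only that the response is a dict with a `job_id` key, as in the example above. The helper names `save_job_id` and `load_job_id` and the JSON file layout are hypothetical and not part of this module.

```python
import json
from pathlib import Path


def save_job_id(response, path):
    """Store the 'job_id' field of a schedule_job response in a JSON file.

    Hypothetical helper: not part of datashakereviewsapi.
    """
    job_id = response["job_id"]
    Path(path).write_text(json.dumps({"job_id": job_id}), encoding="utf-8")
    return job_id


def load_job_id(path):
    """Read back a job id previously written by save_job_id."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["job_id"]
```

In a later session, `load_job_id('first_job.json')` would return the stored id, which can then be passed to `api.get_job_reviews`.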
Get the job results (the reviews):
```python
reviews = api.get_job_reviews(first_job_id)
```
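Since a crawl takes several hours, a script that waits for results needs some form of polling. The sketch below is a generic retry loop and makes no claim about what `get_job_reviews` returns for an unfinished job. The function `poll_until_ready`, its parameters, and the readiness check in the commented usage are all assumptions made for illustration.

```python
import time


def poll_until_ready(fetch, is_ready, interval_seconds=1800, max_attempts=24,
                     sleep=time.sleep):
    """Call fetch() until is_ready(result) is true.

    Returns the first ready result, or None once max_attempts is exhausted.
    Hypothetical helper: not part of datashakereviewsapi.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_ready(result):
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return None


# Possible usage (assumption: an unfinished job yields None or an empty result;
# check the module's actual behaviour before relying on this test):
# reviews = poll_until_ready(
#     fetch=lambda: api.get_job_reviews(first_job_id),
#     is_ready=lambda r: r is not None and len(r) > 0,
# )
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.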
Schedule another job that references the first one, so that only the delta (new reviews) is collected:
```python
response2 = api.schedule_job('https://uk.trustpilot.com/review/store.playstation.com',
                             previous_job_id=first_job_id)
```
Create a job list (one row in this example) and schedule jobs for all the URLs in the list:
```python
import pandas as pd

jobs_list = pd.DataFrame(columns=['Website', 'url', 'latest_job_id', 'status', 'last_crawl',
                                  'latest_schedule_message'])
jobs_list['url'] = ['https://uk.trustpilot.com/review/store.playstation.com']
updated_job_list = api.schedule_job_list(jobs_list)
```
Finally, fetch the reviews, save them to a CSV file, and reschedule all jobs in the job list:
```python
# Plug-n-Play block to schedule/update jobs and get/save results
# The prerequisite for running the snippet is existence of two CSV files with the following structure:
# jobs_list.csv columns: ['Website', 'url', 'latest_job_id', 'status', 'last_crawl', 'latest_schedule_message']
# reviews_list.csv columns: ['job_id', 'source_name', 'id', 'name', 'date', 'rating_value',
# 'review_text', 'url', 'profile_picture', 'location', 'review_title',
# 'verified_order', 'reviewer_title', 'language_code', 'meta_data']
# Code block to refresh review jobs and review results
jobs_list_filepath = 'jobs_list.csv'
reviews_list_filepath = 'reviews_list.csv'
df_jobs = pd.read_csv(jobs_list_filepath, index_col='id')
df_reviews = pd.read_csv(reviews_list_filepath, index_col='unique_id')
df_jobs_new, df_reviews_new = api.get_job_list_reviews(df_jobs, df_reviews)
df_jobs_new.to_csv(jobs_list_filepath, encoding='utf-8-sig')
df_reviews_new.to_csv(reviews_list_filepath, encoding='utf-8-sig')
# Code block to reschedule review jobs
df_jobs = pd.read_csv(jobs_list_filepath, index_col='id')
df_jobs_new = api.schedule_job_list(df_jobs)
df_jobs_new.to_csv(jobs_list_filepath, encoding='utf-8-sig')
```
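The block above expects both CSV files to exist already. The following sketch creates header-only versions using the standard `csv` module. It assumes the index columns read by `index_col='id'` and `index_col='unique_id'` are stored as the first column, which is where `DataFrame.to_csv` writes the index by default. The helper names are hypothetical, and whether the module accepts an empty job or review list is not confirmed here, so add at least one job row with a `url` before running the main block.

```python
import csv

# Column names copied from the comments in the snippet above.
JOB_COLUMNS = ['Website', 'url', 'latest_job_id', 'status', 'last_crawl',
               'latest_schedule_message']
REVIEW_COLUMNS = ['job_id', 'source_name', 'id', 'name', 'date', 'rating_value',
                  'review_text', 'url', 'profile_picture', 'location', 'review_title',
                  'verified_order', 'reviewer_title', 'language_code', 'meta_data']


def write_header_only_csv(path, index_name, columns):
    """Create a CSV holding only a header row: index column first, then data columns."""
    with open(path, 'w', newline='', encoding='utf-8-sig') as handle:
        csv.writer(handle).writerow([index_name] + list(columns))


def create_empty_inputs(jobs_path, reviews_path):
    """Write empty job and review files (hypothetical helper, not part of the module)."""
    write_header_only_csv(jobs_path, 'id', JOB_COLUMNS)
    write_header_only_csv(reviews_path, 'unique_id', REVIEW_COLUMNS)
```

Pass the same file paths that the main block reads, so both steps refer to the same files.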