Items
The Items API fetches item data and exposes it for processing.
[1]:
import pandas as pd  # used below as `pd` in the "From DataFrame" example
from arche.readers.items import Items, JobItems
From Cloud
[2]:
job_items = JobItems(key="381798/1/3")
[3]:
job_items.df.head()
[3]:
(index) | title | price | category | description |
---|---|---|---|---|
https://app.scrapinghub.com/p/381798/1/3/item/0 | It's Only the Himalayas | £45.17 | Travel | “Wherever you go, whatever you do, just . . . ... |
https://app.scrapinghub.com/p/381798/1/3/item/1 | Libertarianism for Beginners | £51.33 | Politics | Libertarianism isn't about winning elections; ... |
https://app.scrapinghub.com/p/381798/1/3/item/2 | Mesaerion: The Best Science Fiction Stories 18... | £37.59 | Science Fiction | Andrew Barger, award-winning author and engine... |
https://app.scrapinghub.com/p/381798/1/3/item/3 | Olio | £23.88 | Poetry | Part fact, part fiction, Tyehimba Jess's much ... |
https://app.scrapinghub.com/p/381798/1/3/item/4 | Our Band Could Be Your Life: Scenes from the A... | £57.25 | Music | This is the never-before-told story of the mus... |
From DataFrame
You can also create items from a pandas DataFrame, which makes the full DataFrame API available for working with them.
Note: the raw item data can differ from its pandas representation, especially around NaN
and integer values; see https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#support-for-integer-na
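The note above can be illustrated with plain pandas. This is a minimal sketch that does not use arche; the item dicts and the `pages` field are hypothetical and exist only to show how a missing value changes an integer column.

```python
import pandas as pd

# Hypothetical raw items: the second one has no "pages" field at all.
raw = [
    {"title": "A", "pages": 10},
    {"title": "B"},
]

df = pd.DataFrame(raw)

# The missing field becomes NaN, which forces the whole column to float64,
# so the integer 10 from the raw data is stored as 10.0.
pages_dtype = str(df["pages"].dtype)

# pandas' nullable integer dtype keeps integers and marks the gap as <NA>.
nullable = df["pages"].astype("Int64")
nullable_dtype = str(nullable.dtype)
```

Here `pages_dtype` is `float64` while the raw value was an `int`, which is the kind of mismatch to keep in mind when comparing raw items with their DataFrame form.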
[4]:
url = "https://raw.githubusercontent.com/scrapinghub/arche/master/docs/source/nbs/data/items_books_1.csv"
items = Items.from_df(pd.read_csv(url))
[5]:
items.df.head(5)
[5]:
(index) | _type | category | description | price | title |
---|---|---|---|---|---|
0 | dict | Travel | “Wherever you go, whatever you do, just . . . ... | £45.17 | It's Only the Himalayas |
1 | dict | Politics | Libertarianism isn't about winning elections; ... | £51.33 | Libertarianism for Beginners |
2 | dict | Science Fiction | Andrew Barger, award-winning author and engine... | £37.59 | Mesaerion: The Best Science Fiction Stories 18... |
3 | dict | Poetry | Part fact, part fiction, Tyehimba Jess's much ... | £23.88 | Olio |
4 | dict | Music | This is the never-before-told story of the mus... | £57.25 | Our Band Could Be Your Life: Scenes from the A... |
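As a rough illustration of what the DataFrame API offers for data shaped like the table above, the sketch below cleans and filters a small frame. It uses plain pandas on a hand-built frame with two rows copied from the output; the `price_gbp` column name is an assumption made for this example, not something arche produces.

```python
import pandas as pd

# Two rows copied from the output above, built by hand so the sketch
# runs without network access or arche installed.
df = pd.DataFrame(
    {
        "title": ["It's Only the Himalayas", "Olio"],
        "category": ["Travel", "Poetry"],
        "price": ["£45.17", "£23.88"],
    }
)

# Strip the currency sign and parse the rest as a float.
# "price_gbp" is a hypothetical column name chosen for this example.
df["price_gbp"] = df["price"].str.replace("£", "", regex=False).astype(float)

# Ordinary boolean indexing selects a single category.
poetry = df[df["category"] == "Poetry"]
```

The same operations should apply to any frame with these columns, whether it is prepared before the items are created or inspected afterwards.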
From Iterable
Alternatively, you can pass in an iterable of items.
[6]:
??Items.from_array
[7]:
items = Items.from_array([{"_key": "0", "title": "Universe"}])
[8]:
items.raw
[8]:
[{'_key': '0', 'title': 'Universe'}]
[9]:
items.df
[9]:
(index) | _key | title |
---|---|---|
0 | 0 | Universe |
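The relationship between the raw list and the tabular view shown above can be approximated with plain pandas. This is only a sketch of the idea, not arche's implementation of `Items.from_array`; the `read_items` generator and its second item are hypothetical.

```python
import pandas as pd


def read_items():
    """Hypothetical source of item dicts; any iterable would do."""
    yield {"_key": "0", "title": "Universe"}
    yield {"_key": "1", "title": "Cosmos"}  # made-up second item


# Materialize the iterable once so the raw data can be kept and reused.
raw = list(read_items())

# Each dict becomes one row and each key becomes one column.
df = pd.DataFrame(raw)
```

Materializing a generator into a list first matters because a generator can be consumed only once, while both the raw and tabular views need the same data.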