Items
An API for fetching and processing item data.
[1]:
from arche.readers.items import *
import pandas as pd  # used explicitly as `pd` in the DataFrame example
From Cloud
[2]:
job_items = JobItems(key="381798/1/3")
[3]:
job_items.df.head()
[3]:
| | title | price | category | description |
|---|---|---|---|---|
| https://app.scrapinghub.com/p/381798/1/3/item/0 | It's Only the Himalayas | £45.17 | Travel | “Wherever you go, whatever you do, just . . . ... |
| https://app.scrapinghub.com/p/381798/1/3/item/1 | Libertarianism for Beginners | £51.33 | Politics | Libertarianism isn't about winning elections; ... |
| https://app.scrapinghub.com/p/381798/1/3/item/2 | Mesaerion: The Best Science Fiction Stories 18... | £37.59 | Science Fiction | Andrew Barger, award-winning author and engine... |
| https://app.scrapinghub.com/p/381798/1/3/item/3 | Olio | £23.88 | Poetry | Part fact, part fiction, Tyehimba Jess's much ... |
| https://app.scrapinghub.com/p/381798/1/3/item/4 | Our Band Could Be Your Life: Scenes from the A... | £57.25 | Music | This is the never-before-told story of the mus... |
From DataFrame
You can also create items from a pandas DataFrame, which gives you the full DataFrame API for working with them.
Note: raw items data can differ from its pandas representation, especially around NaN and integer values; see https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#support-for-integer-na
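The integer/NaN difference mentioned in the note can be reproduced with pandas alone. The following is a minimal sketch, not part of arche; the `pages` values are made up for illustration. It shows that an integer column containing a missing value is stored as floats by default, and that the nullable `Int64` dtype is one way to keep integers.

```python
import pandas as pd

# Hypothetical raw values: two integers and one missing value.
raw_pages = [120, 300, None]

# By default pandas stores the missing value as NaN, which forces float64.
default = pd.Series(raw_pages)
print(default.dtype)

# The nullable integer dtype keeps the integers and marks the gap as <NA>.
nullable = pd.Series(raw_pages, dtype="Int64")
print(nullable.dtype)
```

If integer fidelity matters for a check, comparing against the raw items rather than the DataFrame may be the safer option.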
[4]:
items = Items.from_df(pd.read_csv("https://raw.githubusercontent.com/scrapinghub/arche/master/docs/source/nbs/data/items_books_1.csv"))
[5]:
items.df.head(5)
[5]:
| | _type | category | description | price | title |
|---|---|---|---|---|---|
| 0 | dict | Travel | “Wherever you go, whatever you do, just . . . ... | £45.17 | It's Only the Himalayas |
| 1 | dict | Politics | Libertarianism isn't about winning elections; ... | £51.33 | Libertarianism for Beginners |
| 2 | dict | Science Fiction | Andrew Barger, award-winning author and engine... | £37.59 | Mesaerion: The Best Science Fiction Stories 18... |
| 3 | dict | Poetry | Part fact, part fiction, Tyehimba Jess's much ... | £23.88 | Olio |
| 4 | dict | Music | This is the never-before-told story of the mus... | £57.25 | Our Band Could Be Your Life: Scenes from the A... |
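Because the items are exposed as a regular DataFrame, ordinary pandas operations apply to them. The sketch below rebuilds a few rows from the output above as a standalone DataFrame so that it runs without arche; the `price_gbp` column name is an assumption introduced for illustration.

```python
import pandas as pd

# A few rows copied from the table above, standing in for `items.df`.
df = pd.DataFrame(
    {
        "title": ["It's Only the Himalayas", "Libertarianism for Beginners", "Olio"],
        "price": ["£45.17", "£51.33", "£23.88"],
        "category": ["Travel", "Politics", "Poetry"],
    }
)

# Strip the currency sign and parse the price as a float (hypothetical column name).
df["price_gbp"] = df["price"].str.lstrip("£").astype(float)

# Find the title of the cheapest book.
cheapest = df.sort_values("price_gbp").iloc[0]["title"]
print(cheapest)
```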
From Iterable
Alternatively, you can pass in an iterable of items.
[6]:
??Items.from_array
[7]:
items = Items.from_array([{"_key": "0", "title": "Universe"}])
[8]:
items.raw
[8]:
[{'_key': '0', 'title': 'Universe'}]
[9]:
items.df
[9]:
| | _key | title |
|---|---|---|
| 0 | 0 | Universe |