Items

API for fetching and processing item data.

[1]:
from arche.readers.items import *

From Cloud

[2]:
job_items = JobItems(key="381798/1/3")


[3]:
job_items.df.head()
[3]:
                                                 title                                              price   category         description
https://app.scrapinghub.com/p/381798/1/3/item/0  It's Only the Himalayas                            £45.17  Travel           “Wherever you go, whatever you do, just . . . ...
https://app.scrapinghub.com/p/381798/1/3/item/1  Libertarianism for Beginners                       £51.33  Politics         Libertarianism isn't about winning elections; ...
https://app.scrapinghub.com/p/381798/1/3/item/2  Mesaerion: The Best Science Fiction Stories 18...  £37.59  Science Fiction  Andrew Barger, award-winning author and engine...
https://app.scrapinghub.com/p/381798/1/3/item/3  Olio                                               £23.88  Poetry           Part fact, part fiction, Tyehimba Jess's much ...
https://app.scrapinghub.com/p/381798/1/3/item/4  Our Band Could Be Your Life: Scenes from the A...  £57.25  Music            This is the never-before-told story of the mus...
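
Judging by the output above, the index of the frame is the item URL, which appears to combine the job key with the item's position in the job. The following is only a sketch of that pattern: item_url and BASE_URL are hypothetical names that are not part of arche, and the URL layout is an assumption read off the displayed index.

```python
BASE_URL = "https://app.scrapinghub.com/p"  # prefix copied from the index shown above


def item_url(job_key: str, position: int) -> str:
    """Build the URL of one item from a job key such as "381798/1/3".

    Hypothetical helper for illustration only; arche builds the index itself.
    """
    return f"{BASE_URL}/{job_key}/item/{position}"


# Recreate the five index values displayed by job_items.df.head()
urls = [item_url("381798/1/3", i) for i in range(5)]
print(urls[0])
```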

From DataFrame

You can also create items from a pandas DataFrame, so the full DataFrame API is available for loading and preparing the data.

Note: raw item data can differ from its pandas representation, especially around NaN and integer values - see https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#support-for-integer-na
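
The note above can be illustrated with plain pandas, independent of arche. In this sketch the "pages" field and its values are made up for the example: one raw item lacks the field, so pandas fills the gap with NaN and stores the whole column as floats.

```python
import pandas as pd

# Hypothetical raw items; the second one has no "pages" field.
raw = [{"_key": "0", "pages": 120}, {"_key": "1"}]

df = pd.DataFrame(raw)

# The missing value becomes NaN, which forces the integer column to float64.
print(df["pages"].dtype)

# The nullable integer dtype keeps whole numbers and marks the gap as <NA>.
nullable = df["pages"].astype("Int64")
print(nullable.dtype)
```

Whether to keep the float column or convert it depends on the checks you plan to run; the raw items themselves are unchanged either way.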

[4]:
import pandas as pd
items = Items.from_df(pd.read_csv("https://raw.githubusercontent.com/scrapinghub/arche/master/docs/source/nbs/data/items_books_1.csv"))

[5]:
items.df.head(5)
[5]:
   _type  category         description                                        price   title
0  dict   Travel           “Wherever you go, whatever you do, just . . . ...  £45.17  It's Only the Himalayas
1  dict   Politics         Libertarianism isn't about winning elections; ...  £51.33  Libertarianism for Beginners
2  dict   Science Fiction  Andrew Barger, award-winning author and engine...  £37.59  Mesaerion: The Best Science Fiction Stories 18...
3  dict   Poetry           Part fact, part fiction, Tyehimba Jess's much ...  £23.88  Olio
4  dict   Music            This is the never-before-told story of the mus...  £57.25  Our Band Could Be Your Life: Scenes from the A...
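
As one example of what the DataFrame API makes possible, the price strings shown above can be turned into numbers. This sketch uses plain pandas on two rows copied from the output, so it runs without arche; in a real session the same calls would presumably be applied to items.df. The column name price_gbp is a hypothetical choice.

```python
import pandas as pd

# Two rows copied from the output above; a real session would use items.df.
df = pd.DataFrame(
    {
        "title": ["It's Only the Himalayas", "Libertarianism for Beginners"],
        "price": ["£45.17", "£51.33"],
    }
)

# Strip the currency sign and convert the remaining text to floats.
df["price_gbp"] = df["price"].str.lstrip("£").astype(float)

# Regular DataFrame operations now work on the numeric column.
cheapest = df.loc[df["price_gbp"].idxmin(), "title"]
print(cheapest)
```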

From Iterable

Alternatively, pass an iterable of items (dicts) to Items.from_array.

[6]:
??Items.from_array
[7]:
items = Items.from_array([{"_key": "0", "title": "Universe"}])
[8]:
items.raw
[8]:
[{'_key': '0', 'title': 'Universe'}]
[9]:
items.df
[9]:
   _key  title
0  0     Universe
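
The relation between the raw list and the frame shown above can be imitated with plain pandas. This is only an illustration, not arche's implementation: read_items is a hypothetical generator standing in for any source of item dicts, and the second item is made up. The generator is turned into a list first, on the assumption that the data should stay reusable after the frame is built.

```python
import pandas as pd


def read_items():
    """Hypothetical generator standing in for any source of item dicts."""
    yield {"_key": "0", "title": "Universe"}
    yield {"_key": "1", "title": "Galaxy"}  # made-up second item


# Materialize the iterable once so the raw data can be inspected again later.
raw = list(read_items())

# One row per dict, one column per key.
df = pd.DataFrame(raw)
print(df.shape)
```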