Channel: Kodi Community Forum - Scrapers
Viewing all 171 articles
Browse latest View live

The logic and future of Music scrapers?

i'm currently working on python based scrapers for artists and albums.
since i'm pretty clueless on the scraping process, i'm hoping to get some feedback from more experienced devs.

so far, i came across three things i don't quite understand:


1) if the 'prefer online info' setting is disabled, we pass the artistname to the artist scraper.
if the setting is enabled, we pass the artist mbid to the scraper.

why don't we always pass the mbid (if available) regardless of this setting?

ref: https://github.com/xbmc/xbmc/blob/99c25f...#L220-L223
ref: https://github.com/xbmc/xbmc/blob/99c25f...#L569-L583


2) if the album scraper returns no results, we completely skip the artist scraper. why?

ref: https://github.com/xbmc/xbmc/blob/99c25f...#L843-L883


3) if the 'prefer online info' setting is enabled, and 'show song and album artists' is enabled:
this causes the same artist to be listed twice in your library if the artistname in your tags does not 100% match the artistname the scraper returns.

for instance "The B-52's" vs. "The B-52s":
3.1) i have all songs of an album tagged with artist "The B-52's"
3.2) we start the album scanner and it returns the mbid for this artist
3.3) we pass this mbid to the artist scraper and it returns info for "The B-52s" and kodi adds it to the db.
3.4) kodi now scans all songs for 'additional' artists. it finds "The B-52's" and checks if it's already in the db... nope
3.5) we pass "The B-52's" to the artist scraper and it returns info for whatever closest match it can find and kodi adds this artist to the db

ref: https://github.com/xbmc/xbmc/blob/99c25f...#L843-L883
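
The duplicate in steps 3.1 to 3.5 can be illustrated with a minimal Python sketch. This is not Kodi's code: `add_artist`, the dict-based library and the `"example-mbid"` placeholder are all assumptions made for illustration. It only shows why matching on the exact name creates two entries while matching on the mbid creates one.

```python
def add_artist(library, name, mbid=None):
    """Add an artist keyed by mbid when one is known, else by exact name."""
    key = ("mbid", mbid) if mbid else ("name", name)
    # Keep the first name seen for a key; later spellings do not add entries.
    library.setdefault(key, name)
    return key


# Name-only matching: the scraper spelling and the tag spelling differ,
# so the same artist ends up in the library twice.
by_name = {}
add_artist(by_name, "The B-52s")    # name returned by the scraper
add_artist(by_name, "The B-52's")   # name found in the song tags

# mbid matching: both spellings resolve to a single entry.
# "example-mbid" is a placeholder, not a real MusicBrainz ID.
by_mbid = {}
add_artist(by_mbid, "The B-52s", mbid="example-mbid")
add_artist(by_mbid, "The B-52's", mbid="example-mbid")

print(len(by_name), len(by_mbid))
```

Under these assumptions the name-keyed library ends up with two artists and the mbid-keyed library with one.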

[split] The logic of Music scrapers?

ronie, not sure about your problem, and you probably know this, but

id3 tags are not saved in ASF, MP4 or WAV based files.

if i missed what you were asking, well, that's me: always shoot and miss :D

Javascript scrapers?

(2017-02-07 17:45)ronie Wrote:  i'm currently working on python based scrapers for artists and albums.
Sorry if this is the wrong place to ask, as I know that Kodi has had an integrated Python interpreter for many years now.

But can I ask if there are any plans for Kodi to also get native support for scrapers written in JavaScript?

I see that Razze made a "Kodi addon generator" that lets you use Node.js, which could allow JavaScript to be used in addons.

http://forum.kodi.tv/showthread.php?tid=305664

It would however be more convenient if Kodi had an integrated (embedded) JavaScript engine for scrapers, plugins, and addons.

Examples: V8 (the JavaScript engine from the Chromium project), though it might be overkill, or the Cesanta "V7" embedded JavaScript engine.

https://en.wikipedia.org/wiki/V8_(JavaScript_engine)
https://developers.google.com/v8/

Cesanta "V7" is an embedded JavaScript engine for C/C++ that claims to be the world's smallest JavaScript engine written in C.

https://docs.cesanta.com/v7/master/
https://github.com/cesanta/v7

V8 integration might be better if Kodi still has plans to also integrate CEF (Chromium Embedded Framework)?

https://www.phoronix.com/scan.php?page=n...e-Embedded

Skip CreateSearchUrl?

I'm on my first attempt at writing a very simple scraper for an adult site, but I have run into a problem.

Conveniently the files you download from the site already contain an ID in their filename that can be used to point directly to the info page.

Reading this Scrap (wiki) I find that without an NFO I have to create a search URL first, but I have no need to do that since I can point straight to the correct page without searching.

Is it possible for a scraper to skip the search and only provide an address ready for use in DownloadDetailsPage?

I guess this would be the same as using an NFO where the XML would just return a URL ready to go, but triggered from FILENAME instead of an NFO?
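
The idea of going straight from the filename to the details page can be sketched in Python as follows. This is an illustration of the mapping only, not scraper XML and not a statement of what Kodi supports. The URL template and the filename layout (an ID just before the extension, after a space, underscore or hyphen) are hypothetical.

```python
import re

# Hypothetical values: the site URL and the filename layout are assumptions.
DETAILS_URL = "http://example.com/info/{id}"
# An ID at the end of the name, preceded by a separator, followed by an extension.
ID_PATTERN = re.compile(r"(?:^|[ _-])(\d+)\.[A-Za-z0-9]+$")


def details_url_from_filename(filename):
    """Return the details page URL for a filename, or None if it has no ID."""
    match = ID_PATTERN.search(filename)
    if match is None:
        return None
    return DETAILS_URL.format(id=match.group(1))


print(details_url_from_filename("some_clip_12345.mp4"))
print(details_url_from_filename("noid.mp4"))
```

The separator before the digits keeps a trailing digit in the extension (the "4" in ".mp4") from being mistaken for an ID.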

Movie information scraper on all my Kodi

Hi,
New to this forum and would like some insight on how the movie database scraper works. Right now the moviedb and imdb scrapers fetch movie information for all of my video library and for very few addon movie lists, but I would like the movie scraper to scrape all the names in my Kodi, including all video addons.
Is there a way to see movie information for all video addons?

Please help

first_air_date regex bug in metadata.tvshows.themoviedb.org version 1.3.1?

I think I might have stumbled upon an issue while debugging tv shows not appearing in the library. I'm using the TV show scraper from The Movie Database from http://mirrors.kodi.tv/addons/krypton/me...-1.3.1.zip

For the search function GetSearchResults, it runs a RegExp on first_air_date that expects either ([0-9]+) or null. Looking at the JSON API response, The Movie Database API is returning "" (an empty string) instead.

Searching for Lorraine returns 5 results. The first entry is "Lorraine", but its first_air_date is set to "". The Kodi debug log shows that GetSearchResults returned "Audience with Lorriane". Note that this is the wrong show, but it matches the first_air_date regex. My debug log then shows Kodi returning the GetVideoDetails for the id of the original "Lorraine" show. This TV show is in the Kodi video library.

Searching for Unreported World returns 1 result, and its first_air_date is set to "". Since this is the only result, GetSearchResults returns empty. The next line in the log is the warning "No information found for item xxx, it won't be added to the library."

I've seen this behavior on multiple entries now. See FYI Daily and Location, Location, Location

Other programs where first_air_date is set are working fine. One example is The Big Bang Theory

How do I pass this information to the maintainer?
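
The behaviour described above can be reproduced with a small Python sketch. The two patterns are assumptions modelled on the description (digits or the literal null), not the actual expressions from the scraper, and the JSON fragments are made-up samples.

```python
import re

# Assumed pattern: a quoted value starting with digits, or the literal null.
STRICT = re.compile(r'"first_air_date":(?:"([0-9]+)[^"]*"|null)')
# Tolerant variant: the quoted value may also be empty.
TOLERANT = re.compile(r'"first_air_date":(?:"([0-9]*)[^"]*"|null)')


def year(pattern, fragment):
    """Return the captured year, '' if there is none, or 'no match'."""
    match = pattern.search(fragment)
    if match is None:
        return "no match"
    return match.group(1) or ""


with_date = '"first_air_date":"2007-09-24"'
empty_date = '"first_air_date":""'
null_date = '"first_air_date":null'

for fragment in (with_date, empty_date, null_date):
    print(fragment, "->", repr(year(STRICT, fragment)), repr(year(TOLERANT, fragment)))
```

With these assumed patterns, only the strict one rejects the empty string, which would be consistent with entries that have a blank first_air_date dropping out of the search results.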

read or send movietitle when KODI starts playing a movie

Hey guys, I plan to build a scene via Domoticz that triggers when Kodi starts playing a movie, with a different scene for each movie. Can somebody tell me how to read the movie title via Domoticz, or how Kodi can send the movie title to Domoticz? Thanks.

Scraper which parses another addon's JSONRPC response

Hi,

I wrote a script-type plugin which builds some media information if I call it with JSONRPC and pass a movie title. Now I want to create a scraper which parses my script's JSONRPC response.

I think I can write the regex to extract the data from the JSON, but how can I make a request from the scraper to the script? Is it possible?

If not, is there any other way to do this?

Many thanks!

Subtitles scraper

Hello everyone,

I am trying to make a subtitle scraper for TV media using the Kodi code base. Does Kodi fetch the subtitles from the video stream, or should the subtitles always be provided by external sources like opensubtitles?

Kind regards

TheaudioDB

Banned Scraper Question

I want to apologize to the admin for my question about a banned add-on (scraper) I mentioned. I am fairly new here. I have asked some questions here and there, but I certainly did not want to break the forum's rules, and there was no intent. Lesson learned.

Is THETVDB scraper using the new TVDB 2.0 api?

Ronseal.

Is it? If not, will a new version be ready in time for the October 1st shut off?

Thanks!

AMDB Arabic Movies Scraper

AMDB, The Arabic Movie Database, is the first Arabic movie scraper. It works with Kodi (XBMC).

Style for TV Show Episodes, Local Scraper.. Help

Hello,

I'm using EMM to edit my Kodi-related media files. I never scrape with the online scrapers included in Kodi.

The sort style in which TV show episodes are shown in Kodi differs between shows. Some look like this:

1x01
1x02
1x03
...

others like this:

01.
02.
03.
...

Which command do I have to use to make them look the same? I really don't like the 1x01 style. I checked some .nfo files where the style is correct and compared them to the .nfo files where the style is incorrect, but I couldn't see the command for it.

Please help me.

Thank you.

Scraper reading a xml file

Hello everyone, I would like to develop my first scraper. I would like to know whether it is possible for the scraper to read an XML file in which the details of the film were previously stored.
Thanks to everyone who can help.

using Kodi scrapers for personal project

Hi,

I would like to ask about the stance of the kodi project on a possible scraper reuse in another project (also GPL). And possibly any best practice you would recommend.
I was toying with the idea of writing my own scraper, but then saw the giant library of kodi scrapers, read how to write a scraper, and figured implementing your "engine" (which BTW I find very ingenious) would be a lot easier than writing/maintaining my own scrapers. I currently have a proof of concept working (themoviedb scraper loads and gets info for a movie :) ).

Of course I have no idea if it will go anywhere, but if it does, I'd like to be on good terms with you guys ;)

Thanks,

serafean.

Scraping recorded TV-Shows - extend TVDB Scraper - get function calls right

Already spent days to get this working: I got a local PVR to record Movies and TV shows. For recorded movie files I was already successful scraping them with the information my PVR gives me by calling its XML-API - with the TV shows I still fail and I now hope somebody can help me.

I thought the task would be simple: just make an HTTP call to the PVR API, get the TVDB-ID for the file to be scraped, and then carry on with the regular TVDB scraper using this ID. So mainly just modifying the functions "CreateSearchUrl" or maybe also "GetSearchResults".

Yet the main problem I have is calling a function in the right way to do this TVDB-ID lookup. I tried many ways - two of them are shown below.

What I still have not been able to figure out is how to call a function in the right way. My main questions are:
  1. When I use a function to get an XML file from another site, how do I trigger an actual GET of the content from that site? At what point in the execution of the scraper are URLs really evaluated? My log files suggest that this is not triggered just by calling a function; the functions mainly just extend the URLs and enrich them with code.
  2. Do I need to enclose the results of a function in any XML tags? Almost all of the code samples put the results between <details> and </details>, but why? Why "details", and could I also use "url", for example? I don't understand how this is used. I just noticed that if I don't use any enclosing tags at all, the function simply doesn't show up as executed in the log file.

So this is my code ...

First try - extend CreateSearchUrl to call the PVR API to get the TVDB-ID and then pass it on as usual to GetSearchResults:
Code:
<CreateSearchUrl dest="3">
    <RegExp input="$$1" output="<chain function="GetTVDBIdFromEpisode">\1</chain>" dest="5">
        <expression >(?:%20| |_)([]0-9]+)(?:\.ts|$)</expression>
    </RegExp>
    <RegExp input="$$5" output="<url>http://thetvdb.com/api/GetEpisode.php?id=\1&amp;language=$INFO[language]</url>" dest="3">
        <expression noclean="1" />
    </RegExp>
</CreateSearchUrl>

<GetTVDBIdFromEpisode dest="3">
    <RegExp input="$$4" output="<details>\1</details>" dest="3">
        <RegExp input="$$1" output="<url function="ParseTVDBIdFromEpisode">http://10.0.0.1:8081/record.onexml?id=\1</url>" dest="4">
            <expression />
        </RegExp>
        <expression noclean="1" />
    </RegExp>
</GetTVDBIdFromEpisode>

<ParseTVDBIdFromEpisode dest="5">
    <RegExp input="$$1" output="<details>\1</details>" dest="5">
        <expression><t_thetvdbid>([0-9]+)</t_thetvdbid></expression>
    </RegExp>
</ParseTVDBIdFromEpisode>

Not working - both the API call and the function ParseTVDBIdFromEpisode work, but the result is not passed back up to the top. So the value for the ID to be used in CreateSearchUrl ends up empty.

Code:
23:15:30 T:1826356112   DEBUG: std::vector<CScraperUrl> ADDON::CScraper::FindMovie(XFILE::CCurlFile&, const string&, bool): Searching for '20160830 rtl Bones - Die Knochenjaegerin 2026' using IPTV PVR TV Series Scraper scraper (path: '/storage/emulated/0/Android/data/org.xbmc.kodi/files/.kodi/addons/metadata.iptvpvr.tvdb', content: 'tvshows', version: '1.0.0')
23:15:30 T:1826356112   DEBUG: scraper: CreateSearchUrl returned <url>http://thetvdb.com/api/GetEpisode.php?id=<chain function="GetTVDBIdFromEpisode">2026</chain>&language=de</url>
23:15:30 T:1826356112   DEBUG: scraper: GetTVDBIdFromEpisode returned <details><url function="ParseTVDBIdFromEpisode">http://10.0.0.1:8081/record.onexml?id=2026</url></details>
23:15:30 T:1826356112   DEBUG: CurlFile::Open(0x6e1624b0) http://10.0.0.1:8081/record.onexml?id=2026
23:15:30 T:1826356112    INFO: void XCURL::DllLibCurlGlobal::easy_aquire(const char*, const char*, XCURL::CURL_HANDLE**, XCURL::CURLM**) - Created session to http://10.0.0.1
23:15:30 T:1826356112   DEBUG: static bool CScraperUrl::Get(const CScraperUrl::SUrlEntry&, std::string&, XFILE::CCurlFile&, const string&): Using "UTF-8" charset for XML "http://10.0.0.1:8081/record.onexml?id=2026"
23:15:30 T:1826356112   DEBUG: scraper: ParseTVDBIdFromEpisode returned <details>4818866</details>
23:15:30 T:1826356112   DEBUG: CurlFile::Open(0x6e1624b0) http://thetvdb.com/api/GetEpisode.php?id=
23:15:30 T:1826356112   DEBUG: static bool CScraperUrl::Get(const CScraperUrl::SUrlEntry&, std::string&, XFILE::CCurlFile&, const string&): Using "UTF-8" charset for XML "http://thetvdb.com/api/GetEpisode.php?id="
23:15:30 T:1826356112   DEBUG: scraper: GetSearchResults returned <?xml version="1.0" encoding="utf-8" standalone="yes"?><results></results>
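
In the log above, the TVDB URL is opened with an empty id even though ParseTVDBIdFromEpisode returned 4818866. The intended order of operations (fetch the PVR record, parse the ID, only then build the TVDB URL) can be sketched in Python as follows. This is an illustration of the data flow, not scraper XML. The function names and the layout of the sample PVR record are assumptions; only the t_thetvdbid tag and the TVDB URL come from the post.

```python
import re


def parse_tvdb_id(pvr_xml):
    """Extract the TVDB ID from the PVR record, or None if it is missing."""
    match = re.search(r"<t_thetvdbid>([0-9]+)</t_thetvdbid>", pvr_xml)
    return match.group(1) if match else None


def build_episode_url(tvdb_id, language):
    """Build the TVDB URL only once the ID is actually known."""
    return "http://thetvdb.com/api/GetEpisode.php?id={}&language={}".format(
        tvdb_id, language
    )


def create_search_url(pvr_xml, language="de"):
    tvdb_id = parse_tvdb_id(pvr_xml)
    if tvdb_id is None:
        return None  # nothing to look up without an ID
    return build_episode_url(tvdb_id, language)


# Assumed layout of the PVR response; the real record may look different.
sample = "<record><t_caption>Bones</t_caption><t_thetvdbid>4818866</t_thetvdbid></record>"
print(create_search_url(sample))
```

The point of the sketch is the ordering: the URL string is assembled after the ID lookup has finished, rather than with a placeholder that is resolved later.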

Second try - extend the function GetSearchResults to make the API-call:

Code:
<CreateSearchUrl dest="3">
    <RegExp input="$$1" output="<url>http://10.0.0.1:8081/record.onexml?id=\1</url>" dest="3">
        <expression noclean="1">(?:%20| |_)([]0-9]+)(?:\.ts|$)</expression>
    </RegExp>
</CreateSearchUrl>

<GetSearchResults dest="8">
    <RegExp input="$$5" output="<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><results><entity>\1</entity></results>" dest="8">
        <RegExp input="$$1" output="<title>\1</title>" dest="5">
            <expression><t_caption>([^<]*)</t_caption></expression>
        </RegExp>
        <RegExp input="$$1" output="<url cache="tt\1.xml" function="GetTVDBIdFromEpisode">http://thetvdb.com/api/GetEpisode.php?id=\1&amp;language=$INFO[language]</url>" dest="5+">            
            <expression><t_thetvdbid>([0-9]+)</t_thetvdbid></expression>
        </RegExp>
        <expression noclean="1" />
    </RegExp>
</GetSearchResults>

<GetTVDBIdFromEpisode dest="3">
    <RegExp input="$$1" output="<url cache="\1-$INFO[language].xml">http://thetvdb.com/api/1D62F2F90030C444/series/\1/all/$INFO[language].zip</url>" dest="3">
        <expression><seriesid>([0-9]+)</seriesid></expression>
    </RegExp>
</GetTVDBIdFromEpisode>

Still not working - the API function call is not made during the execution of GetSearchResults. Instead, the whole code for calling the function is passed on to the function "GetDetails", and it does not work there.

Code:
23:11:04 T:1825258144   DEBUG: std::vector<CScraperUrl> ADDON::CScraper::FindMovie(XFILE::CCurlFile&, const string&, bool): Searching for '20160830 rtl Bones - Die Knochenjaegerin 2026' using IPTV PVR TV Series Scraper scraper (path: '/storage/emulated/0/Android/data/org.xbmc.kodi/files/.kodi/addons/metadata.iptvpvr.tvdb', content: 'tvshows', version: '1.0.0')
23:11:04 T:1825258144   DEBUG: scraper: CreateSearchUrl returned <url>http://10.0.0.1:8081/record.onexml?id=2026</url>
23:11:04 T:1825258144   DEBUG: CurlFile::Open(0x71f0c220) http://10.0.0.1:8081/record.onexml?id=2026
23:11:04 T:1825258144    INFO: void XCURL::DllLibCurlGlobal::easy_aquire(const char*, const char*, XCURL::CURL_HANDLE**, XCURL::CURLM**) - Created session to http://10.0.0.1
23:11:04 T:1825258144   DEBUG: static bool CScraperUrl::Get(const CScraperUrl::SUrlEntry&, std::string&, XFILE::CCurlFile&, const string&): Using "UTF-8" charset for XML "http://10.0.0.1:8081/record.onexml?id=2026"
23:11:04 T:1825258144   DEBUG: scraper: GetSearchResults returned <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><results><entity><title>Bones - Die Knochenj&#xE4;gerin</title><url cache="tt4818866.xml" function="GetTVDBIdFromEpisode">http://thetvdb.com/api/GetEpisode.php?id=4818866&language=de</url></entity></results>
23:11:04 T:1825258144   DEBUG: bool ADDON::CScraper::GetVideoDetails(XFILE::CCurlFile&, const CScraperUrl&, bool, CVideoInfoTag&): Reading movie 'http://thetvdb.com/api/GetEpisode.php?id=4818866&language=de' using IPTV PVR TV Series Scraper scraper (file: '/storage/emulated/0/Android/data/org.xbmc.kodi/files/.kodi/addons/metadata.iptvpvr.tvdb', content: 'tvshows', version: '1.0.0')
23:11:04 T:1825258144   DEBUG: CurlFile::Open(0x71f0c220) http://thetvdb.com/api/GetEpisode.php?id=4818866&language=de
23:11:04 T:1825258144    INFO: void XCURL::DllLibCurlGlobal::easy_aquire(const char*, const char*, XCURL::CURL_HANDLE**, XCURL::CURLM**) - Created session to http://thetvdb.com
23:11:05 T:1825258144   DEBUG: static bool CScraperUrl::Get(const CScraperUrl::SUrlEntry&, std::string&, XFILE::CCurlFile&, const string&): Using "UTF-8" charset for XML "http://thetvdb.com/api/GetEpisode.php?id=4818866&language=de"
23:11:05 T:1825258144   DEBUG: scraper: GetDetails returned <?xml version="1.0" encoding="utf-8" standalone="yes"?><details><id></id><chain function="GetArt"></chain><episodeguide><url cache="-.xml">http://thetvdb.com/api/GetEpisode.php?id=4818866&language=de</url></episodeguide></details>
23:11:05 T:1825258144   DEBUG: scraper: GetArt returned <details><url function="ParseArt" cache="-de.xml">http://thetvdb.com/api/1D62F2F90030C444/series//banners.xml</url></details>
23:11:05 T:1825258144   DEBUG: CurlFile::Open(0x71f0c220) http://thetvdb.com/api/1D62F2F90030C444/series//banners.xml
23:11:05 T:1825258144   ERROR: CCurlFile::Open failed with code 404 for http://thetvdb.com/api/1D62F2F90030C444/series//banners.xml

Thank you very much for your help or your ideas - highly appreciated!
Gerald

No scraper or fanart for TV shows

I'm using LibreELEC on a Raspberry Pi 3, with TVHeadend as the backend and of course Kodi as the frontend. I'm using the tvheadend pvr client. I also have two HDHomeRuns, an HDHR and an HDHR3.

I'm trying to get my TV show library to show up. I've chosen the folder and set it up for TVDB. It scans and finds the content of the folder, but I don't get any information or fanart under TV Shows. If I click on Title, recently added episodes, etc., there is nothing there. I have to go into the folder under Files to see the shows. If I highlight a show there, I can see a picture associated with it. How do you get the information to show up?

TMDB TV Series

How can a job be left half done? TMDB was set up to accept the language "PT-BR", but they forgot the TV series.

Merging

Many videos I have don't scrape because they aren't on any database. I'd like Kodi to add them regardless, using their filename as title. So there's a scraper that does that, and this is the code:

Code:
<CreateSearchUrl dest="3" clearbuffers="no">
    <RegExp dest="3" output="&lt;url&gt;http://search.yahoo.com/search?p=$$7&lt;/url&gt;" input="$$7">
        <RegExp dest="7" output="\1" input="$$1">
            <expression noclean="1" trim="1">(.+)</expression>
        </RegExp>
        <expression noclean="1"></expression>
    </RegExp>
</CreateSearchUrl>
<GetSearchResults dest="8" clearbuffers="no">
    <RegExp dest="8" output="&lt;results sorted=&quot;yes&quot;&gt;\1&lt;/results&gt;" input="$$5">
        <RegExp dest="5" output="&lt;entity&gt;&lt;title&gt;\1&lt;/title&gt;&lt;url&gt;http://search.yahoo.com/search?p=$$7&lt;/url&gt;&lt;/entity&gt;" input="$$1">
            <expression noclean="1" trim="1">&lt;title&gt;([^"]*)\-\sYahoo Search</expression>
        </RegExp>
        <expression noclean="1"></expression>
    </RegExp>
</GetSearchResults>
<GetDetails dest="3" clearbuffers="no">
    <RegExp dest="3" output="&lt;details&gt;\1&lt;/details&gt;" input="$$5">
        <RegExp dest="5+" output="&lt;title&gt;\1&lt;/title&gt;" input="$$1">
            <expression noclean="1" trim="1">&lt;title&gt;(.+)\s\-\sYahoo Search Results&lt;/title&gt;</expression>
        </RegExp>
        <expression noclean="1" trim="1"></expression>
    </RegExp>
</GetDetails>

I'd like to merge this into the code of either tmdb.xml or universal.xml, so that I have one scraper that gets most movies from TMDB and then uses the one above when TMDB can't match the filename with anything. I don't know where to put it, though. If I put it at the top, it takes priority over TMDB to the point where TMDB is not used at all, and if I put it at the very end it gets ignored.
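
The behaviour being asked for (try the database first, fall back to the filename only when nothing matches) can be sketched in Python as follows. This shows the desired ordering only, not how to wire it into tmdb.xml or universal.xml. The function names, the `primary_lookup` callable and the result dicts are hypothetical.

```python
import os


def title_from_filename(path):
    """Turn a file path into a readable title: drop the extension, tidy separators."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return " ".join(stem.replace(".", " ").replace("_", " ").split())


def scrape_with_fallback(path, primary_lookup):
    """Use the primary lookup first; fall back to the filename if it finds nothing."""
    result = primary_lookup(path)
    if result:
        return result
    return {"title": title_from_filename(path), "source": "filename"}


def no_match(path):
    # Stand-in for a database lookup that finds nothing.
    return None


def always_match(path):
    # Stand-in for a database lookup that succeeds.
    return {"title": "Known Movie", "source": "database"}


print(scrape_with_fallback("/videos/My_Home.Video_2016.mkv", no_match))
print(scrape_with_fallback("/videos/My_Home.Video_2016.mkv", always_match))
```

The fallback runs only when the primary lookup returns nothing, so it can neither take priority over the database nor be skipped.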