Associated revisions

The pattern list functions are used for regular expression matching. We move them to a separate file and rename them to have an eit prefix instead of an opentv prefix so they can be shared with other eit modules.

From this we can use a regular expression match to extract the season and episode data on a best-effort basis. This logic is based on the opentv extractor.

This is done via config files that are named after the grabber module and exist in this directory: data/conf/epggrab/eit/fixup/
An example name would be uk_freeview.

If the module-specific config file does not exist then we fall back to trying the first component of the filename.

In the above example that would be "uk". This avoids having duplicate files in the case where we have DVB-S and DVB-T in the same country that share the same extraction regex.
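The fallback from a module-specific name to its first component could be sketched roughly as follows. This is only an illustration: the function name scrape_fallback_name is hypothetical and not taken from the tvheadend source, and it assumes the components of the name are separated by an underscore, as in uk_freeview.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the tvheadend source).
 * Copies the part of `module` before the first '_' into `out`.
 * Returns 1 if a shorter fallback name was produced, 0 otherwise
 * (no underscore, empty first component, or buffer too small). */
static int scrape_fallback_name(const char *module, char *out, size_t outlen)
{
  const char *sep = strchr(module, '_');
  size_t len;

  if (sep == NULL || sep == module)
    return 0;                      /* nothing to fall back to */
  len = (size_t)(sep - module);
  if (len >= outlen)
    return 0;                      /* would not fit with the terminator */
  memcpy(out, module, len);
  out[len] = '\0';
  return 1;
}
```

A caller would first try to load the file named after the full module (for example uk_freeview) and only ask for the fallback name if that file is missing.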

The configuration file should contain season_num and episode_num sections that can contain multiple regular expressions to apply in sequence until one produces a match.
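As a rough sketch of "apply in sequence until one produces a match", the following uses the POSIX regex API. It is not the tvheadend implementation: the function name scrape_first_match is hypothetical, and it assumes each pattern carries the number in its first capture group.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the tvheadend source).
 * Tries each pattern in order and returns the number found in the first
 * capture group of the first pattern that matches, or -1 if none match. */
static int scrape_first_match(const char **patterns, size_t count,
                              const char *text)
{
  size_t i;

  for (i = 0; i < count; i++) {
    regex_t re;
    regmatch_t m[2];
    int rc;

    if (regcomp(&re, patterns[i], REG_EXTENDED) != 0)
      continue;                    /* skip patterns that fail to compile */
    rc = regexec(&re, text, 2, m, 0);
    regfree(&re);
    if (rc == 0 && m[1].rm_so >= 0) {
      char buf[16];
      size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
      if (len >= sizeof(buf))
        len = sizeof(buf) - 1;     /* truncate overly long captures */
      memcpy(buf, text + m[1].rm_so, len);
      buf[len] = '\0';
      return atoi(buf);
    }
  }
  return -1;
}
```

One list of patterns would be passed for season_num and another for episode_num, so each section stops at its own first match.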

For DVB-S, the configuration file normally needs to be copied to a file named "eit" since data is broadcast via that mechanism. This isn't done by default since the eit grabber is used by multiple countries that may use different regular expressions.

eit: Allow EIT scraping of season/episode to be disabled at GUI. (#4287).

We now have a tick box in the OTA configuration to enable/disable the scraping of season/episode numbers from the eit grabbers. This will allow us to add other scrapers and tidy-ups in the future (such as removing "Also in HD" from the summary data or "New:" from the title), and allow the user to disable ones they do not want for very low-spec machines or due to their duplicate rules relying on pre-tidy data.

To achieve this configuration, we now derive our eit grabbers to be a "...scraper" type and hook in to the activate callback to load/unload the regular expressions.

The loading of the config also had to be moved to the activate callback rather than the module create to allow us to access the "scrape enabled" boolean.

History

Hi, I'm testing version 4.3-448~g2f07ea0. In the web interface I can find the new section "Scrape behaviour", but I don't understand how I should use it to scrape series/episode EIT information. For example, I would like to scrape the information from a subtitle field like "St.5 Ep.1", which would mean season 5 episode 1. But I can't find any eit/scrape directory in my path /home/hts/.hts/tvheadend/epggrab/.

The eit/scrape directory (by default) should be part of your package. For example on debian/ubuntu you could

dpkg -L tvheadend | grep scrape

and this gives me:

/usr/share/tvheadend/data/conf/epggrab/eit/scrape

There's currently only one scraper in there which scrapes from summary data and can be used as a basis for your country/grabber.

Perhaps the best way forward is to raise a separate bug for your specific grabber, mention all the grabbers enabled, what the country is (or a better description for a name for your config) and a few exact examples of how season/episode look in your data (such as a screenshot, and paste the text in to the message too).

If you look at #4509 you'll see a sample patch for getting it to work for another config, but it will take time to get more written.

@Konermann I've looked at the code and see no technical problems applying it across all three.

The only issue I see (other than slight performance hit) is maybe we will get false matches. But, let's try and if it fails we will have to consider a separate tick-box/regex.

I don't think my broadcaster gives me anything that will match title/description so I can't test. If I put a patch here can you compile and test before it's formally submitted? It will be a couple of days since I have one other change in the area I want to submit first.

The --tsfile sounds interesting but I've not managed to get it to work yet. Running "mediainfo" on an old ts recording shows EIT information, and tvh says it is scanning (after enabling the mux for uk_freesat), but it's not hitting my log statements. I'll look in to it later.

Try the attached patch. I can't test it properly since I don't have description broadcast, but it still works OK with my summary scrape.

It applies against current master.

I took a look at the proposed patch but have not yet tried it. One question: why is the same line "changed |= EPG_CHANGED_EPISODE;" present for both season and episode? And not, for example, "changed |= EPG_CHANGED_SEASON;" and "changed |= EPG_CHANGED_EPISODE;"? Actually another question: what would be the value of EPG_CHANGED_EPISODE?

If season/episode is present in both Title and Description (and Summary) would it be overwritten? Or is there a check to skip further readings if the information was already retrieved from the previous fields?

@Konermann Glad it works. I'll give it a couple of days in case there are any issues (since this patch is also useful in Italy) and then submit it formally as a pull request.

@siviero Good questions.

The series and episode are actually both contained inside an existing type called "epg_episode_num". So I used the same EPG_CHANGED_EPISODE flag for both since it is a change to the underlying epg_episode_num and requires a later call to "epg_episode_set_epnum" (which sets both series and episode number).

However, there exist EPG_CHANGED_EPSER_NUM and EPG_CHANGED_EPNUM_NUM, which are more closely related to EPG_CHANGED_FIRST_AIRED, so I'll change it to use those to make it clearer for people.
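To illustrate the idea of separate change flags for season and episode, here is a small sketch. Everything in it is hypothetical: the DEMO_ names, the bit values and the struct are made up for the example and do not reflect the real EPG_CHANGED_* values or the real epg_episode_num layout.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_CHANGED_SEASON  (1u << 0)
#define DEMO_CHANGED_EPISODE (1u << 1)

/* Hypothetical stand-in for a combined season/episode number. */
typedef struct {
  int season;
  int episode;
} demo_epnum_t;

/* Store the scraped values and return a bitmask saying which changed.
 * Values <= 0 are treated as "nothing scraped" and leave `cur` alone. */
static uint32_t demo_update(demo_epnum_t *cur, int season, int episode)
{
  uint32_t changed = 0;

  if (season > 0 && cur->season != season) {
    cur->season = season;
    changed |= DEMO_CHANGED_SEASON;
  }
  if (episode > 0 && cur->episode != episode) {
    cur->episode = episode;
    changed |= DEMO_CHANGED_EPISODE;
  }
  return changed;
}
```

Because each flag is a distinct bit, a caller can OR the results together and later test which part of the number was updated.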

If details are present in all fields then they will be overwritten. So potentially you could scrape season from title and episode from description.

Unfortunately scraping a string to populate the content type (which internally is a number) is not too easy. I think there are over a hundred genres in DVB spec (such as football, tennis, martial sports, soap, romance, religious, horror, etc).

@Em Smith
1. Can you make it also use the Title for the Episode? In the example program the episode number is in the Title, not in the Subtitle as on most channels.
2. Why does it not work here for season (сезон) and the short form of season, s. (с.)? When the season comes after the episode it works, but when there is only one season it doesn't work.
