The replace plugin works fine so far; of course there is much time to spend on learning to use regex efficiently.

Two things, Nyom Moritz: first, the limit of 1000 matches is very low for large clean-ups. Could Atma change that? Is it accessible via FTP? And would it be no problem to increase it to maybe 20,000 or even much higher?

Quote

Warning: Unknown: Input variables exceeded 1000. To increase the limit change max_input_vars in php.ini. in Unknown on line 0

Second, replace seems to have no operators like \L to replace a string with its lower-case form, or is it just a lack of syntax knowledge on the part of my person? (For example, to get a word like "halLo_Was_iSt" replaced by "hallo_was_ist".)

the limit of 1000 matches is very low for large clean-ups. Could Atma change that? Is it accessible via FTP? And would it be no problem to increase it to maybe 20,000 or even much higher?

This file is not accessible for us, I think. But I changed the code to work around that. It should now work for larger numbers, but it could be very slow. The browser might show a message like "script is not responding" and ask whether to continue the script; simply answer "yes" and wait.

Replace seems to have no operators like \L to replace a string with its lower-case form, or is it just a lack of syntax knowledge on the part of my person? (For example, to get a word like "halLo_Was_iSt" replaced by "hallo_was_ist".)

I don't know. Maybe there is another syntax for it. I found this: https://stackoverflow.com/questions/34592160/regex-string-substitution-upper-and-lower-case
But I don't know at the moment if that would provide a solution.

_/\_

It seems that things like \L and \U for lowercase and uppercase replacement are not "standard" regex features, but only extras in some programs. It would surely be possible to build something like this into the BatchEdit plugin as well. I am not really knowledgeable about regular expressions, but have just found a really good manual (https://www.regular-expressions.info) with clear explanations. So maybe I will try to add things like that when I understand more and find time for it.
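Where a regex engine has no \L or \U in the replacement string, the usual alternative is a callback that receives each match and returns the transformed text. The following is only a minimal sketch in Python to illustrate the idea; it is not BatchEdit code, and the function name and the pattern are assumptions made up for this example.

```python
import re

def lowercase_matches(pattern: str, text: str) -> str:
    """Replace every match of `pattern` with its lower-case form.

    Instead of a \\L operator in the replacement string, a function is
    passed to re.sub; it is called once per match.
    """
    return re.sub(pattern, lambda m: m.group(0).lower(), text)

# Hypothetical pattern: any word (letters, digits, underscore) that
# contains at least one upper-case ASCII letter.
WORD_WITH_UPPERCASE = r"\b\w*[A-Z]\w*\b"

result = lowercase_matches(WORD_WITH_UPPERCASE, "say halLo_Was_iSt here")
print(result)
```

The same approach would have to be built into the plugin itself to be usable from its search form; the sketch only shows that the transformation is simple once a callback is available.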

And I also still don't know how the indexing works and how fast or slow it is. Have there ever been problems with the BatchEdit plugin showing old results that had not been updated, even when a newer version should exist? Looking at the code, it seems to me that the BatchEdit plugin will always load the latest version and show matches accordingly. So a slow index might perhaps only be a problem for new pages that have not yet been included in the index at all, so that no results would be found for them...

So far, Nyom Moritz, all works fine to progress step by step. No real problems seen in regard to the index, as it is used only for selecting the available files while the search is done directly in the files (so no need for refreshing or reindexing, except when new files are added).

Quote from: https://www.dokuwiki.org/plugin:batchedit#page_lookup

Page lookup

BatchEdit uses DokuWiki page index to get the list of existing pages instead of going through the data directories. If the index is incomplete the plugin will not see some pages. This also applies to the “special” pages, for example, namespace templates.

So BatchEdit uses the index only for the list of files.

The index itself: not sure yet, but it seems that refreshing the index also picks up new files. It is not too slow.

In regard to regex, yes, the replacement syntax seems to be special. Here $ seems to be used more, e.g. to insert a captured string \1 = $1; maybe it also works for \L = $L (not tried yet).

Since Atma does not intend to invest much in learning these skills, whenever a need arises he would look around and maybe adopt and investigate samples given here.

It would be good then, step by step, to make explanatory pages on ATI.eu in all regards, to ease future work and handover.

Note that after copying the new file into cs-rm (Cattasanghayana - Roman), Atma only did an index refresh, so maybe this caused an appearance never seen before; it was also the first time using regex in cs-rm.

Oh... maybe it (this error) has to do with the uploaded image/media files (maybe wrong upper case), have to look at it and rename them...

Done so far, but not reindexed yet. It seems that the regex also addresses media files; it is not clear how far (or only when collecting the available files). If replacements are also executed on them, this could probably become a mess.

Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/file-document.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 833

Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/pencil.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 833

Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/arrow-down.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on

Thinking about it, my person added the images to the directory, having taken them from the GitHub download (trusting that this is welcome), and now it seems to work fine in regard to layout.

The download on DokuWiki misses those images. My person reported it in the forum (https://forum.dokuwiki.org/post/61558).

However, the result pages now lack the number of pages matched and the sum of matches, which is a useful control and a way to estimate success.

Oh, I forgot to include the images that were added by the original author in his most recent updates. The DokuWiki download still has the version from February. The original author, Mykola Ostrovskyy (https://github.com/dwp-forge/batchedit) has not yet created a new release version (https://github.com/dwp-forge/batchedit/releases) since February. It seems he is still working on some major changes he would like to add before the next "official" version.

So all errors here are just because I forgot to upload certain new files. But now I think it should be okay?

So just to inform, if coming across that, that is because there are too many results to keep in memory.

I had some discussion with the author who is currently in the process of making some major changes, also thinking about how to deal with huge result sets (https://github.com/dwp-forge/batchedit/issues/16#issuecomment-401596844). So I think I should mention that to him as well and maybe help and try to find a solution. But at the moment don't have much time for this.

However, the result pages now lack the number of pages matched and the sum of matches, which is a useful control and a way to estimate success.

I'm not sure how this could be. Testing from here, I get infos like this:

After "Preview":

Quote

Search results: 9808 matches on 1019 pages

After "Apply":

Quote

Edit results: 9808 matches on 1019 pages, 2 replacements applied

Maybe it's a matter of display, caused by the responsive layout for mobile devices. But what was just white before now contains the matches.

/me : There seems to be a lot to understand in regard to caching ("Zwischenspeicher"). Also troubles with the favicon, even though it is placed in all places, and a big issue in cs-rm: the site takes the old version as the newer one, meaning all "drafts" have to be recovered one by one.

The first looks like an error on Greensta's side. The database was unavailable for a moment it seems. :-| But if not happening more often, hopefully not a big problem.

The second bug was introduced by me. I just wanted to have different colors for match and replacement. So instead of having yellow for both, I wanted to have yellow and green in the preview, and red and green after the replacement.

But it seems I have not changed it for the first preview, where both are still yellow.

The original author was also wondering why I did this change. Now I see it's different for the first preview. Okay.

A lot of new work has been done (https://github.com/dwp-forge/batchedit/commits) in the meantime by the original author (https://github.com/dwp-forge) and others, including some really helpful new features like a progress bar, so that one can estimate how much more time a replacement will take for large updates, and much cleaner solutions to the small changes that I made.

Show confirmation on applying edits with no summary (not sure what this means; some kind of safety prompt in certain cases)

Also there is now a time limit on how long a search or replacement can take (can be changed in Admin settings). I have set this to 10 hours now. Should be enough usually.

Very helpful: there is now a progress bar for the search progress and replacement progress, helping to estimate how much longer it will take (very light grey, difficult to distinguish from white).

Not tested much, hopefully not any new errors.

Edit: Just tested searching for "dhamma" with no limit on results; it returns an empty result page. Probably there are too many results so that something gets broken. Searching for "dhamma" with a limit of 16000 results works, and takes a few minutes to complete. Searching for "Johann" without limit works and gives 2076 results.

Just to report: the multiline option (when working without delimiters) seems not to work properly. All is well when simply using delimiters.

Data capacity: the limit is somewhat over 100MB, which is problematic for replacing large things on the ATI pages:

1. Spaces at line beginnings and line breaks before tags, using find: \n[\s]+< and replace with \n\n. On the one hand, the replaced text would still give a match, and going folder by folder would take long and ends at the root of the language namespace. (Only lib:thai: could be managed so far; the namespace thanissaro would require 300MB+, and everything in one language namespace probably some 10GB.) A possible way, if nothing else is found, is maybe a two-step approach: first replacing with some special character and later replacing that with two line breaks. In this way the matches can be reduced slowly, step by step (about 20-50h).

2. Replacing p-tags with two line breaks, using something like find <\/p>[\s]*<p> and replace with \n\n.

3. The many spaces and tabs between tags, without touching or destroying unformatted text pages (not thought about in detail, but this would be a mass problem as well).

4. Later on, things like em, i, b, br, u and s tags, although these matches can of course be reduced step by step.

5. Of course there will be other mass replacements that are harder to manage, but all can be done with beggar "tricks", effort and patience, as always.
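The two-step idea from point 1 can be sketched as follows. This is only an illustration in Python, not how BatchEdit works internally; the placeholder character and the function names are assumptions, and the lookahead `(?=<)` is added here so that the `<` of the following tag is kept rather than consumed by the match.

```python
import re

# Hypothetical marker; assumed never to occur in the real pages.
PLACEHOLDER = "\u241E"

# A newline plus further whitespace directly in front of a tag.
LEADING_SPACE_BEFORE_TAG = re.compile(r"\n\s+(?=<)")

def step_one(text: str) -> str:
    """Replace the whitespace run with the placeholder.

    The placeholder is not whitespace, so already processed places no
    longer match and the number of remaining matches really shrinks.
    """
    return LEADING_SPACE_BEFORE_TAG.sub(PLACEHOLDER, text)

def step_two(text: str) -> str:
    """Turn every placeholder into the final two line breaks."""
    return text.replace(PLACEHOLDER, "\n\n")

page = "<p>a</p>\n   \n\t<p>b</p>"
print(repr(step_two(step_one(page))))
```

Replacing directly with two line breaks leaves text that the search pattern matches again (the second line break counts as whitespace), which is why the intermediate character helps when the work has to be split into many runs.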

/me : switching back to the huge amount of pts-dictionary -> accessibility replacements for "dummies" and those not wishing to become scholars or x.y.z., ax4 language speakers, Brahmans or dependent on them, before or rather than gaining awakening.

I've managed to learn how to use DokuWiki a bit and have improved some details in this (http://www.accesstoinsight.eu/en/theravada) page as a sample. Among other changes for the better, I've used DokuWiki's footnotes feature, because one doesn't need to go to the bottom of the page to read a footnote. If there is no problem with the changes, I would like to use this page as a model to edit the other pages as well.

Note that the words (those wrapped in "//") in the titles can't be displayed as italic. But this can be fixed by installing this (https://www.dokuwiki.org/plugin:wikiformatstyling) plugin.

It's pretty simple:

1. Rename the plugin's directory to "wikiformatstyling"

2. Place the directory in "dokuwiki/lib/plugins"

If having any general styling ideas, good to give them as samples. My person is currently in the process of regexing all HTML stuff globally on the pages.

It might need another few weeks to match all. i, b, em, strong and u tags may already be replaced completely. Some a-tags are still to be matched; anchors and picture links may still make much work, and images might need some manual care since the whole path is needed for linking to the larger picture.

The header styling seems to be great; just not sure how far it might cause problems with other plugins like include. Generally my person thinks that the fewer plugins, the fewer troubles and maintenance issues.

Not sure whether simply removing stylings in headers, which my person thought of doing after having replaced all HTML, might not be better.

Best to coordinate plugin issues with Upasaka Moritz and also let him keep the overview of installations, or at least let him know if he might not have time.

Some comments on the edits Nyom Danilo made:

Footnotes: generally good to use the wiki's tools, but in regard to the many, many pages, and the impossibility of global replacement without errors, since they are very different, my person would not make use of it for old pages. Using wiki syntax only also becomes a problem for extended footnotes incl. blockquotes, lists...: a great challenge even with the ya-list mode, but possible.

Removing div-tags and adding styling tags: until now Atma has looked to simply bring all to one standard. So there might be parts in the header which will then be globally removed or replaced. The CSS has some stylings already and is not finished yet.

Anchors also still in headers and being removed: although indexes could be removed, there are many cross-links on other pages. If removing the original anchors, one would need to seek all links to them and change them as well. Huge work and possible so far, but in regard to old links all around the internet (zze-links will later get redirects to ati), one would cause a lot of "dead" links. So anchors should best never be removed.

There are still divisions (use of div instead of WRAP is generally preferred) wrapping headers. This causes the section edit not to work. Atma's objective is to fix that globally, but only after all HTML is replaced.

Further: there is no need for particular styling of headers since that can be, and actually already is, done in the CSS sheet. They already have center-align styling.

<blockquote>"Birth is ended, the holy life fulfilled, the task done! There is nothing further for the sake of this world." + <WRAP indent> - <cite> [[en:ptf:buddha#done|MN 36]]</cite></blockquote>

They are all fine already as they are; it already replaces with wiki and wrap tags, like this. It displays wrongly as code because there is still a tab at the beginning of the line. Removing just the tab here will display it perfectly so far. blockquote is an additional plugin ATI uses, incl. the cite tag.

html-values:

There are still entities such as &iuml; around. If coming across them, best to collect them in a list with their proper replacements so my person could apply this globally to all pages.

Footer:

No need to put much effort into manually editing the divs and styling. That is an issue for thousands of pages and will be done "at once".

Content edits - Styling edits:

If seeing certain typos or small style issues in the text,... aside from div and span tags, great if correcting them. If seeing something strange in an old HTML tag, best to report it and collect such cases in one place.

Sadhu for efforts! And mudita.

Atma thinks, however, that it is easier to undo the edits and repeat some of the small ones, as Upasaka thinks is well, now possibly being more informed.

It's pretty simple:

1. Rename the plugin's directory to "wikiformatstyling"

2. Place the directory in "dokuwiki/lib/plugins"

Installing plugins: it's best, secure and easily done via the Admin panel. Not sure if Nyom Danilo has admin rights, which should be no problem. Good, however, since some tools are very powerful and could even destroy much, to coordinate with Nyom Moritz or ask my person if unsure in some regards.

In regard to distinguishing tags and divs, whether already changed or old ones: old HTML tags always include ="..." and/or other marks. The stylings for wiki tags always look like this: <div class_texts #anchor_text> or <span class_texts #anchor_text> or <span #anchor_text> or <div class_texts>. If seeing others than these, i.e. old ones, best to list them with a link to the place where seen (maybe in a topic only for that, or here in a post).

Not sure if Nyom Danilo is familiar and skilled with regex: powerful, but also dangerous, able to destroy a lot. Let it be known if wishing to use it for global changes.

I have not read everything in detail now, but have, after taking a quick look at it, installed the WikiStyle Script (https://www.dokuwiki.org/plugin:wikiformatstyling) plugin. It seems the plugin does not change any stored data, but only affects how things are rendered.

Possible that there might be conflicts with other plugins like the include-plugin, as Bhante says. But it would not destroy any data. If there are any problems with it, one can simply uninstall/deactivate it and maybe think of other solutions.

Not yet looked much at any results and if all works correctly, but the example page (http://www.accesstoinsight.eu/en/theravada) seems to look fine so far.

Not sure if Nyom Danilo is familiar and skilled with regex: powerful, but also dangerous, able to destroy a lot. Let it be known if wishing to use it for global changes.

I have some experience with regex.

If Bhante thinks it's a good idea, he could specify the patterns to be matched in the HTML pages and the output data of the DokuWiki pages. Thus a standard model would be clearly defined to be used as a reference, and I (or anyone else) could come up with the corresponding regex rules and make the appropriate changes.

When editing the HTML page, I saw many tags which didn't appear to have any effect, so I ended up removing them.

Although no problem, Nyom Danilo, generally in Dhamma: never, when correcting, risk that something gets lost. If not seeing the use for now, simply "hide" it. In that way the Dhamma could be maintained up to our days.

Atma, as told, will need some more days to replace most of it (lists and tables will need manual edits).

The next step would be to bring it into a nice, easy standard and create templates for new pages.

ATI had a huge standard and my person thinks that most of it is good to carry on. It needs a while to understand all (working with it for 7+ years now, still finding hidden treasures).

Is Nyom familiar with CSS?

Some of the ATI syntax features he can find at Ati.eu Syntax. Detailed documentation is not written yet, although started. The old posts in this topic give some impressions for understanding.

To see what Atma is currently doing, best to check Activity Lists or http://accesstoinsight.eu/index?do=recent

If wishing to use regex for many pages (be careful, it can damage much and is not easy, even impossible, to recover, incl. all Sanghayana Tipitakas), he finds the batch editor (https://www.dokuwiki.org/plugin:batchedit) tool in the Admin area.

(In the last year+ Atma has surely started all anew 2 or 3 times... 10,000s of pages, because of some mistakes...)

Relative links missing ./ or ../ at the beginning would not work for now. Generally my person tends to replace relative links with the whole path, starting with :en:....:file

Since the structure has been slightly changed in regard to the tipitaka folder, starting with adding :sut: and renaming folders toward the cs-rm standard (not all done yet, especially in the kn folder), not all links are correct for now and need work.

Media links may not be correct for now. Since image links with a target require an absolute URL, having either host/_detail/en/.../file (display of pictures within the media manager) or host/_media/en/.../file (direct download), it's not possible to replace relative links correctly in a more global way. An alternative would be either to change the logos used by the media manager or to let go of the zze-logos for certain files.

Links which have been generated by script in the doc-info at the footer need to be made static.

It might be that my person has deleted certain anchors which were referenced from the index and also from other pages. Those would need to be renewed.

There is no idea yet of how to maintain hover texts in good ways. Most have been given up. The rest have been placed within images (info sign).

My person will focus now on replacing the rest of div and span tags.

The Portuguese pages, btw, although only 3 pages, would have increased the work and time by 50% and are left behind meanwhile, thinking it much faster to edit the three pages manually next to the global replacements.

The last and greatest challenge of replacements will then be that of the special lists and tables... and of course many small and special things will be left.

Divisions should be "fine" already. A few alien spans are left, and will be done tomorrow, when the sun is shining.

There is one issue in regard to classes, but more so to ids: lower-case letters. While Notepad++ gives the possibility to replace with the lower-case value, BatchEdit seems not to support \l$1. Since there might be a lot of anchors and links to them containing upper case (the links are not so problematic, since they are cut down to lower case by the system, also for the anchor extension), maybe someone has an idea how to transform them more globally.

That looks strange. No clear idea at the moment. Knowing exactly what the regex to replace was might help to understand it better.

Connection problems should not be a possible reason. Possibly a programming error from some of the modifications I made.

I think it is best to keep the old revision, rather than trying to replace from this result.

I can look in the morning, or if Bhante could send me the FTP password (already have the password now), maybe I find something useful from here (at work now, in taxi, but a "boring" night so far only sitting and waiting).

As it affected all non-standard characters: maybe BatchEdit has some process which deals with character sets and which was possibly interrupted by connectivity problems. Sometimes, when there is no reaction, Atma would also send orders twice, which might "disturb" ongoing processes.

h3 and h4 have been done with similar regex before, but that didn't touch that sample page (maybe 20 affected in all).

Something unusual is that this change is printed in gray in the list, possibly pointing to something.

where the UTF8 encoding of special characters somehow was garbled, which can be seen in the comparison (http://accesstoinsight.eu/en/lib/authors/thanissaro/bmc/section0059?do=diff&rev2%5B0%5D=1564461374&rev2%5B1%5D=1564668616&difftype=sidebyside) between these two directly subsequent revisions.

That is the obvious error that I have seen. Everything else mentioned is not clear to me:

Interesting is also that the replacements, although looking similar, are unique on each page. So the additionally added invisible characters are different while the strings appear to be equal.

I have not seen any other page than BMC section 59 where the mentioned UTF8 encoding error happened, exactly between these two revisions (http://accesstoinsight.eu/en/lib/authors/thanissaro/bmc/section0059?do=diff&rev2%5B0%5D=1564461374&rev2%5B1%5D=1564668616&difftype=sidebyside).

Can Bhante point to other pages where this happened?

In any case this looks like an encoding error, which happened only one or maybe more times (which I have not seen) when saving some page(s) with BatchEdit.

My first idea would have been that maybe the file(s) was/were edited in an external program in between which could not deal properly with the UTF8 encoding and saved it wrongly. DokuWiki has some mechanism for recognizing external edits and including them as such in the revision history. But it might be that it does not always work, does not always become "aware" when something was changed from outside (I think the check only happens when saving), so that the change would appear included in another change (in this case a replacement which was mostly correct, but based on a file that had already been modified and wrongly encoded in between). Not sure about that, if it would be possible that especially BatchEdit might skip the mechanism of "being aware" if something has changed from outside.
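To make the hypothesis concrete: garbling of this kind typically arises when UTF-8 bytes are read as a single-byte encoding and then saved again. The following is only a sketch in Python of that mechanism, assuming Latin-1 as the wrong encoding; whether this is what actually happened on the BMC page is not known, and the function names are made up for illustration.

```python
def garble(text: str) -> str:
    """Simulate a program that reads UTF-8 bytes as Latin-1.

    Every non-ASCII character turns into two or more odd characters,
    while plain ASCII text stays untouched.
    """
    return text.encode("utf-8").decode("latin-1")

def repair(text: str) -> str:
    """Undo garble(); only valid if the text really was garbled this way."""
    return text.encode("latin-1").decode("utf-8")

original = "P\u0101\u1e6dimokkha"  # "Pāṭimokkha" written with escapes
broken = garble(original)
print(repr(broken))
print(repair(broken) == original)
```

If the garbled strings on the affected page can be turned back into the correct text this way, that would support the wrong-encoding explanation; if not (for example with Windows-1252, which is not fully reversible), the old revision remains the safer source.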

Apart from that, I have seen that BatchEdit has become a lot more complex since the last time I looked into the code. Many things happen which I don't understand so quickly. For example it looks like the results of a BatchEdit search (with matches and replacements, even before they are applied) are stored in a certain structure in some temporary files, and most likely they are read back again from those files when finally applying the replacements. Maybe it could happen somehow that the data gets wrongly encoded there sometimes and reloaded from there afterwards for some reason with the wrong encoding. But that is just vague speculation now since I don't really yet see how it all works.

As long as not possible to replicate the error in some clear test case I think it will be difficult to figure it out. ::)

_/\_ _/\_ _/\_

/me now probably not having much time the next days or week to find the reason.

The idea of invisible characters came because a search for the defective strings would only match the page they were copied from. As for which pages are affected: just the next, next, ... page of BMC, seemingly only in the BMC2 part.... ohh, ... No. Because the next-page links are wrong, always the same page.

So it's just one page. My person then guesses it's because a connection problem caused a certain action not to be completed. Atma also thinks that it would not be worth investigating the plugin fully and redeveloping it. Maybe, however, good to inform the developer that such things happen.