Often you get "None" as the response, or sometimes the correct values interspersed with outliers of 0.0.
I tried to improve reliability with an external pull-up, but it does not help.
Tested on two independent LoPy4 boards, same behaviour. The LoPy reads DS18X20 sensors reliably.

@livius Then it's better to keep the second sleep_us(1) in both read_bit() and write_bit(). In the latter, it does not hurt anyhow. It just ensures a minimum time between two write_bit() events.
For read_bit() it also depends on the rise time of the output. You could check whether the figure changes if you add a smaller pull-up resistor, like 4k7 Ohm.

@affoltep Looking at the data sheet, that seems OK. Faster is better. For reading, it says:
"Output data from the DS18B20 is valid for 15μs after the falling edge that initiated the read time slot. Therefore, the master must release the bus and then sample the bus state within 15μs from the start of the slot."

@livius @affoltep Just tried the same code on the LOLIN32 pro, which should be similar to a WiPy3. That device behaves the same way as the FiPy (not surprising).
The next attempt was on a Wemos LOLIN32 lite, which is the closest to the WiPy2 I can get; it shows the same figures as the LoPy.
All tested with the "if addWait" addition.
(sorry @daniel and @jmarcelino, that I do not have the full set of Pycom modules here)

@livius I agree about the second sleep, but even on the WiPy2 it should work without the first sleep. Did you verify that it fails? The DS18B20 requires a minimum low time of 1 µs. On my LoPy, the pin(0)/pin(1) sequence creates a low pulse of ~2 µs. Changing that to:

addWait = False
pin(0)
if addWait:
    sleep_us(1)
pin(1)

changes the pulse to ~3 µs on the LoPy, sometimes extended to 4 µs.
On the FiPy with that change the basic pulse width is also 3 µs, but pulses are extended more frequently, sometimes to more than 15 µs, which breaks the communication with the DS18B20.
And @affoltep could try the variant w/o sleep with his LoPy too. I have no DS18B20 here, otherwise I would have tried that already.

@livius Hm, but the LoPy, which is architecturally similar to the WiPy2, cannot generate a 1 µs pause either. Is it really needed?
I also tested the new lib on the LoPy and it worked without the sleep_us(1).

@robert-hh Wow, this could mess up the whole communication right from the start. I have now made all the modifications, including the gc.collect() call at the beginning of read_bytes and write_bytes.
I will let the device run overnight to check whether the communication is more stable. I will let you know more tomorrow.

@affoltep I made another test, rebuilding just the initial part of write_bit(1) (which is similar to read_bit()). It shows the initial low period with and w/o the sleep_us(1) lines.
Timing with sleep_us(1)
Timing without sleep_us(1)
With the sleep_us(1) call, this initial period can extend to about 30 µs, effectively writing a 0 or missing the read window at 15 µs. Without that sleep, the initial pulse is still sometimes extended, but much less often and to less than 10 µs, which is still in the safe range.
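As a rough way to check scope measurements against the limits mentioned in this thread, the sketch below tests whether a list of measured low-pulse widths stays inside the safe range. The function names are made up for illustration; the 1 µs minimum and the 15 µs limit are the figures quoted above, and treating exactly 15 µs as unsafe is my own conservative assumption.

```python
# Limits quoted in this thread (in microseconds).
MIN_LOW_US = 1    # minimum low time the DS18B20 needs to see
MAX_LOW_US = 15   # the bus must be released (and sampled) before this point


def slot_is_safe(low_us):
    """Return True if one initial low pulse is long enough and ends before 15 us."""
    return MIN_LOW_US <= low_us < MAX_LOW_US


def worst_case_ok(samples_us):
    """Return True only if every measured pulse width is inside the safe range."""
    return all(slot_is_safe(width) for width in samples_us)


# Illustrative values only, roughly in the range reported above.
print(worst_case_ok([3, 4, 10]))   # without sleep_us(1): all pulses below 15 us
print(worst_case_ok([3, 30]))      # with sleep_us(1): one pulse stretched to 30 us
```

The point of checking the worst case rather than the average is that a single stretched pulse is enough to corrupt one bit of a transfer.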

@affoltep Looking at the example code for read_bit(), both sleep_us(1) calls are unnecessary, since the shortest effective distance I have seen between a pin(x) and a pin(y) is 3 µs, and the call of sleep_us() alone takes at least 3-4 µs.
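A minimal sketch of what a read_bit() without the two short sleeps could look like. This is not the library's actual code: the pin object and sleep function are passed in as parameters so the sequence can be tried off-device, and the 60 µs wait at the end is my assumption, based on the data sheet's minimum slot length, not a value taken from the lib.

```python
def read_bit(pin, sleep_us, slot_rest_us=60):
    """Read one bit in a single read slot, without the sleep_us(1) calls.

    pin          -- callable pin object: pin(0)/pin(1) drive it, pin() reads it
    sleep_us     -- microsecond sleep function, e.g. time.sleep_us on the board
    slot_rest_us -- assumed wait for the rest of the slot (hypothetical default)
    """
    pin(0)                  # start the slot by pulling the bus low
    pin(1)                  # release at once; call overhead already gives >1 us low
    value = pin()           # sample right away, well inside the 15 us window
    sleep_us(slot_rest_us)  # let the slot finish before the next one starts
    return value
```

On the board this would be called with the same kind of callable pin used in the snippets in this thread and with time.sleep_us. Whether interrupts should also be disabled around the first three lines is left out here.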

@robert-hh, @livius I included a gc.collect() in the reset, read_bit and write_bit functions of the onewire lib.
Of course it now takes ages to get a temperature value (~15 s), but it seems a little more reliable. There are still misreadings, but it does seem a little better. I will run some tests overnight to prove it...

Any other ideas for a workaround? Would it be possible to solve this issue in firmware?
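One software-side stopgap, which does not fix the bus timing itself, would be to take several readings, drop the "None" results and use the middle value, so that a single runaway 0.0 does not get through. A minimal sketch, assuming read_temp is whatever function returns one temperature or None (the name and the number of attempts are placeholders):

```python
def read_filtered(read_temp, attempts=5):
    """Return the middle value of the non-None readings, or None if all fail."""
    values = []
    for _ in range(attempts):
        value = read_temp()
        if value is not None:      # skip failed reads
            values.append(value)
    if not values:
        return None
    values.sort()
    # Middle element (upper middle for an even count); a lone outlier
    # such as 0.0 ends up at one end of the sorted list and is ignored.
    return values[len(values) // 2]
```

This only hides occasional bad reads. If more than half of the readings are wrong, the result is wrong too, and each call takes several conversion times.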

@livius This is on a plain board, nothing else started after boot except for connecting the WiFi to an AP.
The FiPy was sitting on an expansion board, the LoPy on a breadboard, Firmware 1.14.0.b1. No sensor module attached, just the output on P11.
Besides that, given similar tests, like IRQ latency, the results are NOT surprising. Making timing precise on chips with serial flash (and serial RAM) requires a lot of consideration.

So it outputs a 20 µs pulse and then consumes some memory. I logged the output with the oscilloscope in persistence mode, so you see the varying pulses (FiPy yellow, LoPy green):
FiPy:
LoPy:
Neither result is good, in that there is no pulse shorter than ~40 µs, but the result on the FiPy is worse, with the pulse sometimes extended to over 500 µs.
Adding a gc.collect() immediately before the pulse-creating sequence improves the situation drastically. The longest pulse then seen for the FiPy is 65 µs, and the LoPy always shows about 30 µs. Please note that the time scale has changed in the pictures below.
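The test loop itself is not reproduced in this post. A sketch of the idea, with gc.collect() placed directly before the pulse, might look like the following. The function name, the pulse polarity and the amount of memory allocated are assumptions for illustration; the pin and sleep function are passed in so the sequence can also be tried without hardware.

```python
import gc


def pulse_after_collect(pin, sleep_us, width_us=20, garbage_bytes=1000):
    """Emit one pulse, running the garbage collector immediately beforehand."""
    gc.collect()            # collect now, so a collection is less likely mid-pulse
    pin(1)                  # start of the pulse (polarity is an assumption)
    sleep_us(width_us)      # nominal pulse width, 20 us as in the test above
    pin(0)                  # end of the pulse
    # Allocate memory after the pulse, imitating a script that keeps
    # consuming heap between pulses.
    return bytearray(garbage_bytes)
```

Calling gc.collect() once before each timing-critical section, rather than inside every read_bit() or write_bit(), keeps the cost of the collection out of the per-bit path.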

I have done some additional investigations. I flashed back to firmware 1.9.2.b2. (The newest firmware, 1.15.0.b1, causes a problem when running the LoPy with the deepsleep shield on battery power, and only on battery power: it always wakes up due to a pin interrupt. Apparently the newest firmware has a problem if 5 V is not present.)

So both the LoPy4 and the LoPy are running identical code on 1.9.2.b2. I use a 10 kOhm pull-up on the LoPy4 and no resistor on the LoPy (just the internal pull-up from the onewire config). When I discovered the instability of the readings on the LoPy4, I added the resistor. Unfortunately I didn't have a 4.7k, so I took a 10k, but it shouldn't be an issue.

So at the moment, the assumption from @robert-hh seems to be the most plausible. Does one of you work for Pycom? It would be nice if you could integrate a C lib for onewire communication in the firmware.

One of the LoPy4 boards I tested is slightly more stable than the other, but never at the bulletproof level of the LoPy.