Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:102213
Return-Path:
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Delivered-To: mailing list internals@lists.php.net
Received: (qmail 90504 invoked from network); 10 Jun 2018 10:03:14 -0000
Received: from unknown (HELO lists.php.net) (127.0.0.1)
by localhost with SMTP; 10 Jun 2018 10:03:14 -0000
Authentication-Results: pb1.pair.com smtp.mail=nicolas.grekas@gmail.com; spf=pass; sender-id=pass
Authentication-Results: pb1.pair.com header.from=nicolas.grekas@gmail.com; sender-id=pass
Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.218.68 as permitted sender)
X-PHP-List-Original-Sender: nicolas.grekas@gmail.com
X-Host-Fingerprint: 209.85.218.68 mail-oi0-f68.google.com
Received: from [209.85.218.68] ([209.85.218.68:40290] helo=mail-oi0-f68.google.com)
by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP
id DB/83-62758-F57FC1B5 for ; Sun, 10 Jun 2018 06:03:12 -0400
Received: by mail-oi0-f68.google.com with SMTP id f79-v6so15469558oib.7
for ; Sun, 10 Jun 2018 03:03:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=20161025;
h=mime-version:sender:in-reply-to:references:from:date:message-id
:subject:to:cc;
bh=8SGmkYdoUc62NsD1LgiCZ0dKfS0ssgNxoAzu7PF93mA=;
b=AYNEdLmoZiBaxG6w1m79P0nGzbaLYVptKcIWqW9XTdxZL7CRKJZYfHQDMlwZoTO6Lu
eWstb/zUHGvM8x56O+zQLb562W+gTd6qyCind9PBRdI5YYehTNBNMM7VUwqx/twLWncD
U4vfuNzfR9YMtnoEWLii0/4dI+ImuA2oPQYFwCgJjg60UheB401ZnEi356Kyb9xlm7FD
31KlM+/sSsFXvBOm0GjTVxI+K7uKG9kCAwRM6VWsUFHv7Mp3T3lbnkG1f4REfvRNbTxw
1ZBSeA7lVg5tjNC+WTgZ/dvVqULKNcuSYq9owIRTKLZeyKhrWdeiDuCnvLIT8Vdyksp6
jzPQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20161025;
h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
:date:message-id:subject:to:cc;
bh=8SGmkYdoUc62NsD1LgiCZ0dKfS0ssgNxoAzu7PF93mA=;
b=ehctaYB3egMEphVnuVanzPUWErjs55kfAjiqeF8AzEnRtJ1QrsDJPjkke4PXjoadJ8
oWQlK0o9nTAguxGgx/BoMp8/quVuA7hCu1311azSuiqsRVLf0d0I3atVgbCAVmmLOKV2
ZSERUGU5xjvFjephgsdagiOvmFEk0ucTygsPGnPp3lmKWQamquRDte1opKAEnhk6ntZH
A7K+OJXuR+qsEAzCIYm+aSpN9O+wbkK+qE8G3pBLVbJ66phuCIkWXzvOSh5i0Ba6k7ns
ptoKHiRicLAfRkl4/ESWTZwVAf0ETaheueQ7pulbmBUVahqwu8I/STdqaZL29Do87rbX
KD9Q==
X-Gm-Message-State: APt69E2rH3PXDrWLe8nNMNgY08Lm28toTsyr9/GUEPYEQMsDXlCJvMN2
Se+TmYr7uLQ7KxwY48uFkPYLwJ5AK2sZVP4f86Y=
X-Google-Smtp-Source: ADUXVKJTCZoG5ng0ArVYVtizGwFRoI8B4u/Ra4PHAl56m4rahyq4RZiZvYWhBOiR9b9xqmV0qfbMiiQa98ZYMTNPqGw=
X-Received: by 2002:aca:5406:: with SMTP id i6-v6mr6820597oib.115.1528624989088;
Sun, 10 Jun 2018 03:03:09 -0700 (PDT)
MIME-Version: 1.0
Sender: nicolas.grekas@gmail.com
Received: by 2002:a4a:8b28:0:0:0:0:0 with HTTP; Sun, 10 Jun 2018 03:02:48
-0700 (PDT)
In-Reply-To:
References:
Date: Sun, 10 Jun 2018 12:02:48 +0200
X-Google-Sender-Auth: DoUxdNzLDZQ3PAMwm90WfsjfRts
Message-ID:
To: Nikita Popov
Cc: PHP Internals List
Content-Type: multipart/alternative; boundary="000000000000783032056e46bd82"
Subject: Re: [PHP-DEV] A replacement for the Serializable interface
From: nicolas.grekas+php@gmail.com (Nicolas Grekas)
--000000000000783032056e46bd82
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Please allow me to up vote for this proposal.
I'm working on some serialization logic these days, and `Serializable` is
totally broken: it breaks internal references in serialized data
structures, and breaks custom serializers (e.g. igbinary=C3=A0 from inspect=
ing
nested structures.
The proposal here would just fix all these issues.
+1000 from me for what it matters :)
2017-04-21 15:50 GMT+02:00 Nikita Popov :
> On Fri, Apr 21, 2017 at 2:47 PM, Micha=C5=82 Brzuchalski <
> michal.brzuchalski@gmail.com> wrote:
>
> > I know my voice is doesn't mean anything but IMHO interface with magic
> > methods could bring more inconsistency.
> >
> > I know PHP is consistently inconsistent but I would prefer if it is
> > posdible to fix an issue with present method naming.
> >
> > Cheers
> >
> Magic methods have a distinct backwards compatibility advantage. They all=
ow
> you to add __serialize/__unserialize to an existing class that currently
> uses Serializable. Older PHP versions will then use the Serializable
> implementation, while never versions will use __serialize/__unserialize. =
An
> interface makes this a lot more complicated, because you either have to
> bump your PHP version requirement (unlikely), or you have to provide a sh=
im
> interface for older PHP versions. This shim interface would then be part =
of
> any library currently implementing Serializable, which seems sub-optimal =
to
> me. That's why I think magic methods are better for this case (though I
> don't strongly care).
>
> Also, to answer an OTR question: We cannot simply reuse the Serializable
> interface by allowing an array return value from
> Serializable::unserialize(). The array return value is only a means to an
> end: the important part is that the new serialization mechanism does not
> share serialization state -- using arrays instead of strings is just a
> convenient way to achieve this. However, Serializable::unserialize()
> currently shares the state and we cannot change this without breaking BC =
--
> so we cannot reuse this interface.
>
> Nikita
>
>
>
> > 21.04.2017 11:39 "Nikita Popov" napisa=C5=82(a):
> >
> >> Hi internals,
> >>
> >> As you are surely aware, serialization in PHP is a big mess. Said mess
> is
> >> caused by some fundamental issues in the serialization format, and
> >> exacerbated by the existence of the Serializable interface. Fixing the
> >> serialization format is likely not possible at this point, but we can
> >> replace Serializable with a better alternative and I'd like to start a
> >> discussion on that.
> >>
> >> The problem is essentially that Serializable::serialize() is expected =
to
> >> return a string, which is generally obtained by recursively calling
> >> serialize() in the Serializable::serialize() implementation. This
> >> serialize() call shares state information with the outer serialize(), =
to
> >> ensure that two references to the same object (or the same reference)
> will
> >> continue referring to a single object/reference after serialization.
> >>
> >> This causes two big issues:
> >>
> >> First, the implementation is highly order-dependent. If
> >> Serializable::serialize() contains multiple calls to serialize(), then
> >> calls to unserialize() have to be repeated **in the same order** in
> >> Serializable::unserialize(), otherwise unserialization may fail or be
> >> corrupted. In particular this means that using parent::serialize() and
> >> parent::unserialize() is unsafe. (See also
> >> https://bugs.php.net/bug.php?id=3D66052 and linked bugs.)
> >>
> >> Second, the existence of Serializable introduces security issues that =
we
> >> cannot fix. Allowing the execution of PHP code during unserialization =
is
> >> unsafe, and even innocuous looking code is easily exploited. We have
> >> recently mitigated __wakeup() based attacks by delaying __wakeup() cal=
ls
> >> until the end of the unserialization. We cannot do the same for
> >> Serializable::unserialize() calls, as their design strictly requires t=
he
> >> unserialization context to still be active during the call. Similarly,
> >> Serializable prevents an up-front validation pass of the serialized
> >> string,
> >> as the format used for Serializable objects is user-defined.
> >>
> >> The delayed __wakeup() mitigation mentioned in the previous point also
> >> interacts badly with Serializable, because we have to delay __wakeup()
> >> calls to the end of the unserialization, which in particular also
> implies
> >> that Serializable::unserialize() sees objects prior to wakeup. (See al=
so
> >> https://bugs.php.net/bug.php?id=3D74436.)
> >>
> >> In the end, everything comes down to the fact that Serializable requir=
es
> >> nested serialization calls with context sharing.
> >>
> >> The alternative mechanism (__sleep + __wakeup) does not have these
> issues
> >> (anymore), but it is not sufficiently flexible for general use: Notabl=
y,
> >> __sleep() allows you to limit which properties are serialized, but the
> >> properties still have to actually exist on the object.
> >>
> >> I'd like to propose the addition of a new mechanism which essentially
> >> works
> >> the same way as Serializable, but uses arrays instead of strings and
> does
> >> not share context. I'm not sure about the naming (RealSerializable,
> >> anyone?), so I'll just go with magic methods __serialize() and
> >> __unserialize() for now:
> >>
> >> public function __serialize() : array;
> >> public function __unserialize(array $data) : void;
> >>
> >> From a userland perspective the implementation should be the same as f=
or
> >> Serializable methods, but with interior serialize()/unserialize() call=
s
> >> stripped out. Right now Serializable implementations already usually
> work
> >> by doing something like "return serialize([ ... ])", this would change
> it
> >> to just "return [ ... ]" and move the serialize()/unserialize() call
> into
> >> the engine, where we can perform it safely and robustly.
> >>
> >> The new methods should reuse the "O" serialization format, rather than
> >> introducing a new one. This allows a measure of interoperability with
> >> previous PHP versions, which can still decode serialized strings from
> >> newer
> >> versions using __wakeup().
> >>
> >> If an object has both __wakeup() and __unserialize(), then
> __unserialize()
> >> should be called. If an object implements both
> Serializable::unserialize()
> >> and __unserialize(), then we should invoke one or the other based on
> >> whether "C" or "O" serialization is used.
> >>
> >> Thoughts?
> >>
> >> Nikita
> >>
> >
>
--000000000000783032056e46bd82--