1 Introduction

The study of group encryption - the encryption analogue of group signatures [17] - was initiated by Kiayias, Tsiounis and Yung (KTY) [29] in 2007. While group signatures allow the signers to hide their identities within a set of certified senders, group encryption protects the anonymity of the decryptors within a set of legitimate receivers. To keep users accountable for their actions, signatures/ciphertexts can be de-anonymized in cases of disputes, using a secret key possessed by an opening authority.

In a group encryption scheme, the sender of a ciphertext can generate publicly verifiable proofs that: (i) The ciphertext is well-formed and can be decrypted by some registered group member; (ii) The opening authority can identify the intended receiver should the needs arise; (iii) The plaintext satisfies certain requirements, such as being a witness for some public relation.

Group encryption (GE) schemes are motivated by a number of appealing privacy-preserving applications. A natural application is for encrypted email filtering, where GE allows a firewall to accept only those incoming emails that are intended for some certified organization user. If accepted, the encrypted messages are guaranteed to satisfy some prescribed requirements, such as the absence of spammy/unethical keywords or the presence of keywords that are of the organization’s interests.

As pointed out in [29] and subsequent work [1, 15, 32, 36], GE can also find interesting applications in the contexts of anonymous trusted third parties, oblivious retriever storage systems or asynchronous transfers of encrypted datasets. For instance, it allows to archive on remote servers encrypted datasets intended for some anonymous client who paid a subscription to the storage provider. Furthermore, the recipient can be identified by a judge if a misbehaving server is found guilty of hosting suspicious transaction records or any other illegal content.

From the theoretical point of view, one can build a secure GE scheme based on anonymous CCA2-secure public key encryption schemes, digital signatures, commitments and zero-knowledge proofs. The designs of GE are typically more sophisticated than for group signatures, due to the need of proving well-formedness of ciphertexts encrypted via hidden-but-certified users’ public keys. In particular, as noted by Kiayias et al. [29], GE implies hierarchical group signatures [50] - a proper generalization of group signatures [4, 5].

In their pioneering work, Kiayias et al. instantiated GE based on the Decisional Composite Residuosity (DCR) and the Decisional Diffie Hellman (DDH) assumptions. The zero-knowledge proof of ciphertext well-formedness in their scheme is interactive, but can be made non-interactive in the random oracle model using the Fiat-Shamir transformation [21]. Cathalo et al. [15] subsequently proposed a non-interactive realization based on pairings in the standard model. El Aimani and Joye [1] then suggested various efficiency improvements for pairing-based GE. The first construction of GE from lattice assumptions was later presented by Libert et al. [32].

Libert et al. [36] enriched the KTY model of GE by introducing a refined tracing mechanism inspired by that of traceable signatures [28]. In this setting, the opening authority can release a user-specific trapdoor that enables public tracing of ciphertexts sent to that specific users without violating other users’ privacy. Izabachène et al. [26] suggested mediated traceable anonymous encryption - a related primitive that addresses the problem of eliminating subliminal channels.

Current Limitations of GE. To date, GE has been much less well-studied than group signatures [17], even though they are functionally dual to each other. The group signature primitive has a longer history of development, and serves as a primary case study for privacy-preserving authentication systems. Meanwhile, GE was introduced close to the rises of powerful encryption systems such as attribute-based [25], functional [8] and fully-homomorphic [22] encryption, and has not gained much traction. Nevertheless, given its compelling features and the nice applications it can potentially offer, we argue that GE deserves more attention from the community. In this work, we thus aim to contribute to the development of GE. To start with, we identify several limitations of existing GE systems.

First, the problem of user revocation, which is a prominent issue in multi-user cryptosystems, has not been formally addressed. The KTY model [29], while allowing dynamic enrolments of new users to the group, does not provide any mechanism to prevent revoked users (e.g., those who were expelled for misbehaviours, stopped subscribing to the services or retired from the organizations) from decrypting new ciphertexts intended for them (unless the whole system is re-initiated). We next observe that, although it was not discussed by authors of [36], their refined tracing method might pave the way for a mechanism akin to verifier-local revocation [9], in which verifiers test incoming ciphertexts using the trapdoors corresponding to all revoked users. Beside incurring complexity linear in the number of revoked users, such a mechanism is known to only provide a weak notion of anonymity (called selfless-anonymity) for non-revoked users. A formal treatment of fully dynamic GE (i.e., which supports both dynamic enrolments and revocations of users) with strong security requirements is therefore highly desirable.

The second limitation is about the usefulness of existing GE schemes in the context of email filtering - which is arguably the most natural application of the primitive. Recall that such filtering functionality is supposed to be done by defining a relation \(R = \{(x, w)\}\) and accepting only messages w such that \((x, w) \in R\), for a publicly given x. However, in all known instantiations of GE, the relations for messages are defined according to the computationally hard problems used in other system components. More precisely, the KTY scheme [29] employs the discrete-log relation, i.e., it only accepts w if \(g^w = h\) for given (gh). Subsequent works follow this pattern: pairing-based relations are used in [1, 15, 36] and a Short-Integer-Solution relation is used in [32] for message filtering. While such treatment does comply the definitions of GE, it seems too limited to be useful for filtering spams. Designing GE schemes with expressive policies that capture real-life spam filtering methods is hence an interesting open question.

Third, regarding the diversity of concrete computational assumptions used in building GE, among all existing schemes, the only one that is not known to be vulnerable against quantum computers [48] is the lattice-based construction from [32]. This raises the question of realizing GE based on alternative quantum-resistant assumptions, e.g., those from codes, multivariates and isogenies. In terms of privacy-preserving cryptographic protocols, other post-quantum candidates are much less developed compared to lattice-based constructions, and it would be tempting to catch up in the scope of GE.

Our Contributions and Techniques. This work addresses all the 3 limitations of existing GE that we discussed above. Our first contribution is a formal model for fully dynamic group encryption (FDGE), with carefully defined syntax and robust security notions. Our model empowers the KTY model with the user revocation functionality and paves the way for new instantiations of GE in which enrolling new users and revoking existing users can be done simultaneously and efficiently. As a second contribution, we suggest to realize message filtering for GE not based on computationally hard problems, but a list of keywords and how these keywords match with substrings of the encrypted messages. To this end, we define 2 policies for accepting “good” messages and rejecting “bad” ones, that capture the spirit of the String Matching problem and the Approximate String Matching problem that are widely used in contemporary spam filtering techniques. Our third contribution is the first code-based GE scheme that follows our FDGE model and that supports both of the 2 message filtering policies we propose. We provide more technical details in the following.

Group Encryption with Full Dynamicity. We formalize the primitive of \(\mathsf {FDGE}\) as the encryption analogue of fully dynamic group signatures [10]. Beyond the usual joining algorithm of the KTY model [29], \(\mathsf {FDGE}\) makes it possible to update the group periodically to reflect user revocations. Our model is defined in a way such that it captures the 2 most commonly used approaches for handling user revocations in group signatures, based on revocation lists [13] and accumulators [14]. As noted in [10], there is an attack inherently to group signature schemes following the revocation-list-based approach. When translated into the GE context, such attack would permit group users to decrypt ciphertexts sent to them even before they join the group. Our FDGE model does not allow such attack, and we view this as a preventative measure in case a revocation-list-based revocable \(\mathsf {GE}\) will be proposed in the future.

Regarding security requirements, we define the notions of message secrecy, anonymity, and soundness that are inline with and carefully extended from the KTY model [29]. We consider adversaries with strong capabilities, including the ability to corrupt the group manager (\(\mathsf {GM}\)) and/or the opening authority (\(\mathsf {OA}\)) to a large extent. Specifically, not only do we permit full corruptionFootnote 1 of the \(\mathsf {GM}\) and/or \(\mathsf {OA}\) when defining message secrecy and anonymity, but we also tolerate maliciously generated keys for the fully corrupted authorities. In terms of soundness, only partial corruptionFootnote 2 of \(\mathsf {OA}\) is allowed. Note that the assumption on partially corrupted \(\mathsf {OA}\) is minimal, since otherwise a fully corrupted \(\mathsf {OA}\) could simply refuse to open ciphertexts.

Message Filtering. Spamming and spam filtering are complicated areas, and currently there is no single filtering solution that can address all the clever tricks of spammers. In the present work, we do not attempt to invent such a solution. Our goal is to equip GE schemes with some basic, yet commonly used policies for filtering. More precisely, we suggest to employ a public list \(S = \{\mathbf {s}_1, \ldots , \mathbf {s}_k\}\) of k binary keywords, each of which has bit-length t, to test against length t substrings of the encrypted message \(\mathbf {w} \in \{0,1\}^p\). This list can be regularly updated by the GM, depending on the interests and needs of the organization. The keywords \(\mathbf {s}_i\) could either be “good” ones that all legitimate messages are expected to contain, or be “bad” ones that should be far - in terms of Hamming distance - from all substrings of \(\mathbf {w}\). Respectively, we consider the following 2 policies.

  1. 1.

    “Permissive”:  \(\mathbf {w}\) is a legitimate message if and only if there exists \(i \in [1,k]\) such that \(\mathbf {s}_i\) is a substring of \(\mathbf {w}\). This policy captures the String Matching problem and can be applied when the current interests of the group are reflected by the keywords \(\mathbf {s}_i\)’s, and all messages that do not contain any of these keywords are rejected.

  2. 2.

    “Prohibitive”:  \(\mathbf {w}\) is a legitimate message if and only if for every length-t substring \(\mathbf {y}\) of \(\mathbf {w}\) and every \(\mathbf {s}_i \in S\), their Hamming distance is at least d. This policy is related to the Approximate String Matching problem. Here, the keywords \(\mathbf {s}_i\)’s could correspond to topics that are unethical, illegal, adultery, or simply out of the group’s interests. The requirement on minimum Hamming distance d is to address spammers who might slightly alter \(\mathbf {s}_i\) so that it passes the filtering while still being somewhat readable.

Having defined the policies, our next step is to derive methods for proving in zero-knowledge that the secret message \(\mathbf {w}\) satisfies each of the policies, which will be used by the message sender when proving the well-formedness of the ciphertext. Let us discuss the high-level ideas.

Regarding the permissive policy, our observation is that if we form matrix \(\mathbf {W} \in \{0,1\}^{t \times (p-t+1)}\) whose columns are length-t substrings of \(\mathbf {w}\), and matrix \(\mathbf {S} \in \{0,1\}^{t \times k}\) whose columns are the keywords \(\mathbf {s}_i\), then \(\mathbf {w}\) is legitimate if and only if there exist weight-1 vectors \(\mathbf {g} \in \{0,1\}^{p-t+1}\) and \(\mathbf {h} \in \{0,1\}^k\) such that \(\mathbf {W} \cdot \mathbf {g} = \mathbf {S} \cdot \mathbf {h}\). Then, to handle this relation, we employ Stern’s permuting technique [49] to prove knowledge of such \(\mathbf {g}, \mathbf {h}\) and we adapt Libert et al.’s technique [32] for proving the well-formedness of the quadratic term \(\mathbf {W} \cdot \mathbf {g}\).

As for the prohibitive policy, we consider all the \((p-t+1)\cdot k\) sums \(\mathbf {z}_{i, j} \in \{0,1\}^t\) over \(\mathbb {Z}_2\) of substrings of \(\mathbf {w}\) and keywords in S. Then, \(\mathbf {w}\) is legitimate if and only if all these sums have Hamming weight at least d. To prove these statements, we perform the following extension trick, inspired by [37].

We append \((t-d)\) coordinates to \(\mathbf {z}_{i,j} \in \{0,1\}^t\) to get \(\mathbf {z}^{*}_{i, j} \in \{0,1\}^{2t-d}\) with Hamming weight exactly t. Such an extension is always possible if \(\mathbf {z}_{i,j}\) has weight at least d. Furthermore, the converse also holds: if \(\mathbf {z}^{*}_{i,j}\) has weight t, then the original \(\mathbf {z}_{i,j}\) must have weight at least \(t - (t-d) = d\). At this point, it suffices to use Stern’s permuting technique [49] for proving knowledge of fixed-weight binary vectors.

The techniques sketched above can be smoothly integrated into our code-based instantiation of FDGE.

Code-Based Instantiation. To design a scheme satisfying our model of FDGE, we would need: (1) An anonymous CCA2-secure public-key encryption to encrypt messages under a group user’s public key and to encrypt the user’s public key under the OA’s public key; (2) A secure digital signature to certify public keys of group members; and (3) Zero-knowledge proofs compatible with the encryption and signature layers, as well as with the message filtering layer.

In the code-based setting, the first ingredient can be obtained from the randomized McEliece encryption scheme [46] that satisfies CPA-security and the Naor-Yung transformation [44]. The second ingredient seems not readily available, as code-based signatures for which there are efficient zero-knowledge proofs of knowledge of message/signature pairs are not known to date. To tackle this issue, we adapt the strategy of Ling et al. in their construction of lattice-based fully dynamic group signatures [38]. This amounts to replacing the signature scheme by an accumulator scheme [6] equipped with zero-knowledge arguments of membership. We hence can make use of the code-based realization of Merkle-tree acummulators recently proposed by Nguyen et al. [45].

The main idea is to use Merkle-tree accumulators to certify users’ public key. Let \(N = 2^\ell \) be the maximum expected number of group users. Let \(\mathsf {pk} = (\mathbf {G}_0, \mathbf {G}_1)\) be a user public key, where \(\mathbf {G}_0, \mathbf {G}_1\) are 2 McEliece encryption matrices (recall that we employ the Naor-Yung double encryption technique). Then \(\mathsf {pk}\) is hashed to a vector \(\mathbf {d} \ne \mathbf {0}\), which is placed at the tree leaf corresponding to the identity \(j \in \{0, 1\}^\ell \) of the user in the group. A tree root is then computed based on all the \(2^\ell \) leaves. The user’s certificate, which is made available to message senders, consists of \(\mathsf {pk}\), j and hash values in the path from her leaf to the root.

When sending a message \(\mathbf {w}\) satisfying “permissive” or “prohibitive” policy to user j, the sender uses \(\mathsf {pk}\) to encrypt \(\mathbf {w}\) as \(\mathbf {c}_w\), and uses the OA’s public key to encrypt j as \(\mathbf {c}_\mathsf{oa}\), so that OA can recover j if necessary. As for well-formedness of ciphertext, sender proves in zero-knowledge that:

  1. 1.

    \(\mathbf {w}\) satisfies the given policy. This can be done using the discussed techniques.

  2. 2.

    \(\mathbf {c}_\mathsf{oa}\) is an honestly computed ciphertext of j. This part is quite straightforward to realize via techniques for Stern’s protocol.

  3. 3.

    \(\mathbf {c}_w\) is a correct ciphertext of the \(\mathbf {w}\) from (1.), computed under some hidden public key \(\mathsf {pk}\), whose hash value \(\mathbf {d} \ne \mathbf {0}\) is at the tree leaf corresponding to the j from (2.). This is indeed the most sophisticated portion of our scheme. It requires to demonstrate: (i) membership of \(\mathbf {d}\) in the tree and \(\mathbf {d} \ne \mathbf {0}\) is the hash of value of \(\mathsf {pk}\); (ii) \(\mathbf {c}_w\) has the form \(\mathbf {c}_w = \mathsf {pk} \cdot \begin{bmatrix} \mathbf {r} \\ \mathbf {w} \end{bmatrix} + \mathbf {e}\), where \((\mathbf {r}, \mathbf {e})\) is the encryption randomness.

While statement (i) can be handled using the techniques from [37, 45], (ii) would require to prove an Learning-Parity-with-Noise-like relation with hidden-but-certified matrix \(\mathsf {pk}\). We then tackle this problem by adapting the techniques for Learning-with-Errors relations [47] from [32] into the binary setting.

Having discussed the main technical ingredients of the scheme, let us now explain how user revocations and dynamic user enrolments can be done in a simple manner based on Merkle trees. The ideas, first suggested in [38], are as follows. At the setup phase, all leaves in the tree are set as \(\mathbf {0}\). When a new user joins the group, as mentioned, \(\mathbf {0}\) is changed to \(\mathbf {d} \ne \mathbf {0}\). If the user is later revoked from the group, the value is set back to \(\mathbf {0}\). For each change, the GM can efficiently update the tree by re-computing the path in time \(\mathcal {O}(\log N)\). Note that in the zero-knowledge layer above, the sender in part proves that \(\mathbf {d}\) is non-zero - which is interpreted as “the sender is indeed an active group user”.

Putting everything together, we obtain the first construction of code-based (fully dynamic) GE. In the random oracle model, we prove that the scheme satisfies all the stringent security notions of FDGE, namely, message secrecy, anonymity and soundness, based on the security of the code-based technical ingredients we employ.

The scheme, however, should only be viewed as a proof-of-concept, as it is not practical - due to the involvement of heavy zero-knowledge arguments. However, in comparison with [32] the only known GE scheme from post-quantum assumptions, ours is more efficient. The main reason is that ours uses a Merkle tree - which can be viewed as a weak form of signatures, while theirs relies on a standard-model lattice-based signature scheme, whose supported zero-knowledge arguments incurred an overhead factor of \(\log ^2 q\), where \(q > 2^{30}\). We estimate that, for 128 bits of security, our argument size is about 2 orders of magnitude smaller than theirs. In other words, our scheme is more efficient than [32], but is still not practical. We leave the problem of obtaining practically usable FDGE schemes from post-quantum assumptions as an interesting open question.

Other related Work. Enabling efficient user revocations in advanced privacy-preserving cryptographic constructions is generally a challenging problem, since one has to ensure that revoked users are no longer able to act as active users, and the workloads of other parties (managers, non-revoked users, verifiers) do not significantly increase in the meantime. In the context of group signatures, several different approaches have been suggested [9, 13, 14] to address this problem, and efficient pairing-based constructions supporting both dynamic joining and efficient revocation were given in [34, 35, 43]. Bootle et al. [10] pointed out a few shortcomings of previous models, and put forward robust security notions for fully dynamic group signatures. Here, we adapt the [10] model to provide the first formal treatment of user revocation in the context of GE.

The major tools for building those privacy-preserving constructions are zero-knowledge (\(\mathsf {ZK}\)) proof [24] and argument [12, 23] systems that allow to prove the truth of a statement while revealing no additional information. Almost all known zero-knowledge proof/argument systems used in code-based cryptography follow Stern’s framework [49]. Variants of Stern’s protocol have been employed to design privacy-preserving constructions, such as proofs of plaintext knowledge [42], linear-size ring signatures [11, 18, 40, 41], linear-size and sublinear-size group signatures [2, 20], proofs of valid openings for commitments and proofs for general relations [27]. Recently, Nguyen et al. [45] proposed a number of new code-based privacy-preserving protocols, including accumulators, range proofs, logarithmic-size ring signatures and group signatures. However, prior to our work, no construction of code-based GE was known.

Organization. The rest of the paper is organized as follows. In Sect. 2, we recall the background on Stern-like protocols and previous techniques for designing zero-knowledge protocols in Stern’s framework. In Sect. 3, we present our \(\mathsf {ZK}\) argument for a quadratic relation. This is crucial for proving the permissive relation in Sect. 4 - where we also present the strategies for proving the prohibitive relation. Section 5 introduces the model and security requirements of \(\mathsf {FDGE}\). Next, we present our code-based instantiation of FDGE in Sect. 6. Due to space limit, several supporting materials are deferred to the full version of the paper [51].

2 Preliminaries

Notations. Let \(a,b\in \mathbb {Z}\). Denote [ab] as the set \(\{a, \ldots , b\}\). We simply write [b] when \(a=1\). Let \(\oplus \) denote the bit-wise addition operation modulo 2. If S is a finite set, then \(x \xleftarrow {\$} S\) means that x is chosen uniformly at random from S. Throughout this paper, all vectors are column vectors. When concatenating vectors \(\mathbf {x}\in \{0,1\}^m\) and \(\mathbf {y}\in \{0,1\}^k\), for simplicity, we use the notation \((\mathbf {x}\Vert \mathbf {y})\in \{0,1\}^{m+k}\) instead of \((\mathbf {x}^\top \Vert \mathbf {y}^\top )^\top \). The Hamming weight of vector \(\mathbf {x} \in \{0,1\}^m\) is denoted by \(wt(\mathbf {x})\). The Hamming distance between vectors \(\mathbf {x}, \mathbf {y} \in \{0,1\}^m\) is denoted by \(d_H(\mathbf {x}, \mathbf {y})\), and is equal to \(wt(\mathbf {x} \oplus \mathbf {y})\). Denote by \(\mathsf {B}(n,{\omega })\) the set of all binary vectors of length n with Hamming weight \(\omega \), and by \(\mathsf {S}_n\) the symmetric group of all permutations of n elements.

2.1 Stern-Like Protocols

The statistical zero-knowledge arguments of knowledge presented in this work are Stern-like [49] protocols. In particular, they are \(\varSigma \)-protocols in the generalized sense defined in [7, 27] (where 3 valid transcripts are needed for extraction, instead of just 2). The basic protocol consists of 3 moves: commitment, challenge, response. If a statistically hiding and computationally binding string commitment is employed in the first move, then one obtains a statistical zero-knowledge argument of knowledge (ZKAoK) with perfect completeness, constant soundness error 2/3. In many applications, the protocol is repeated a sufficient number of times to make the soundness error negligibly small. For instance, to achieve soundness error \(2^{-80}\), it suffices to repeat the basic protocol 137 times.

An abstraction of Stern’s protocols. We recall an abstraction, adapted from [31], which captures the sufficient conditions to run a Stern-like protocol. Looking ahead, this abstraction will be helpful for us in presenting our ZK argument systems: we will reduce the relations we need to prove to instances of the abstract protocol, using our specific techniques. Let KL be positive integers, where \(L\ge K\), and let \(\mathsf {VALID}\) be a subset of \(\{0,1\}^L\). Suppose that \(\mathcal {S}\) is a finite set such that one can associate every \(\phi \in \mathcal {S}\) with a permutation \(\varGamma _\phi \) of L elements, satisfying the following conditions:

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {w} \in \mathsf {VALID} \Longleftrightarrow \varGamma _\phi (\mathbf {w}) \in \mathsf {VALID}, \\ \text {If } \mathbf {w} \in \mathsf {VALID} \text { and } \phi \text { is uniform in } \mathcal {S}, \text { then } \varGamma _\phi (\mathbf {w}) \text { is uniform in } \mathsf {VALID}. \end{array}\right. } \end{aligned}$$
(1)

We aim to construct a statistical ZKAoK for the following abstract relation:

$$\begin{aligned} \mathrm {R_{abstract}} = \big \{\big ((\mathbf {M}, \mathbf {v});\mathbf {w}\big ) \in \mathbb {Z}_2^{K \times L} \times \mathbb {Z}_2^K \times \mathsf {VALID}: \mathbf {M}\cdot \mathbf {w} = \mathbf {v}\big \}. \end{aligned}$$

The conditions in (1) play a crucial role in proving in ZK that \(\mathbf {w} \in \mathsf {VALID}\): To do so, the prover samples \(\phi \xleftarrow {\$}\mathcal {S}\) and lets the verifier check that \(\varGamma _\phi (\mathbf {w}) \in \mathsf {VALID}\), while the latter cannot learn any additional information about \(\mathbf {w}\) thanks to the randomness of \(\phi \). Furthermore, to prove in ZK that the linear equation holds, the prover samples a masking vector \(\mathbf {r}_w \xleftarrow {\$} \mathbb {Z}_2^L\), and convinces the verifier instead that \(\mathbf {M}\cdot (\mathbf {w} \oplus \mathbf {r}_w) = \mathbf {M}\cdot \mathbf {r}_w \oplus \mathbf {v}.\)

The interaction between prover \(\mathcal {P}\) and verifier \(\mathcal {V}\) can be found in [31] or the full version of the paper. The resulting protocol is a statistical \(\mathsf {ZKAoK}\) with perfect completeness, soundness error 2/3, and communication cost \(\mathcal {O}(L)\).

2.2 Previous Extension and Permutation Techniques

In this section, we first recall the permutation technique that is designed to prove knowledge of a binary vector of fixed hamming weight, which originates from Stern [49].

Technique for Handling Binary Vector with Fixed Hamming Weight. For any \(\mathbf {e}\in \mathsf {B}(n,\omega )\) and \(\sigma \in \mathsf {S}_{n}\), it is easy to see that the following equivalence holds Footnote 3.

$$\begin{aligned} \mathbf {e}\in \mathsf {B}(n,\omega ) \Longleftrightarrow \sigma (\mathbf {e})\in \mathsf {B}(n,\omega ), \end{aligned}$$
(2)

To show that the vector \(\mathbf {e}\) has hamming weight \(\omega \), the prover samples a uniformly random permutation \(\sigma \in \mathsf {S}_{n}\) and shows the verifier that \(\sigma (\mathbf {e})\in \mathsf {B}(n,\omega )\). Due to the above equivalence (2), the verifier should be convinced that \(\mathbf {e}\in \mathsf {B}(n,\omega )\). Furthermore, \(\sigma (\mathbf {e})\) reveal no information about \(\mathbf {e}\) due to the uniformity of \(\sigma \).

The above technique was later developed to prove various forms of secret vectors. We now review the extension and permutation techniques for proving the knowledge of arbitrary binary vectors, which were presented in [33].

Let \(\oplus \) denote the bit-wise addition operation modulo 2. For any bit \(b \in \{0,1\}\), denote by \(\overline{b}\) the bit \(b = b\oplus 1\). Note that, for any \(b,d \in \{0,1\}\), we have \(\overline{b\oplus d} = b \oplus d \oplus 1 = \overline{b}\oplus d\).

Techniques for Handling Arbitrary Binary Vectors. To prove the knowledge of a binary vector \(\mathbf {x}\in \{0, 1\}^n\), define the extension process and permutation as follows.

  • For a binary vector \(\mathbf {x} = [x_1| \ldots | x_n]^{\top }\in \{0,1\}^n\), where \(n \in \mathbb {Z}^+\), denote by \(\mathsf {Encode}(\mathbf {x})\) the vector \([\overline{x}_1| x_1| \ldots | \overline{x}_n| x_n]^{\top } \in \{0,1\}^{2n}\).

  • Let \(\mathbf {I}_n^*\in \mathbb {Z}_2^{n \times 2n}\) be an extension of the identity matrix \(\mathbf {I}_n\), obtained by inserting a zero-column \(\mathbf {0}^n\) right before each column of \(\mathbf {I}_n\). We have for \(\mathbf {x} \in \{0,1\}^n\),

    $$\begin{aligned} \mathbf {x} = \mathbf {I}_n^* \cdot \mathsf {Encode}(\mathbf {x}). \end{aligned}$$
    (3)
  • For \(\mathbf {b}= [b_1| \ldots | b_n]^\top \in \{0,1\}^n\), define the permutation \(F_{\mathbf {b}}\) that transforms vector \(\mathbf {z} = [z_{1,0}| z_{1,1}| \ldots | z_{n,0}| z_{n,1}]^\top \in \{0,1\}^{2n}\) into:

    $$ F_{\mathbf {b}}(\mathbf {z}) = [z_{1,b_1}| z_{1,\overline{b}_1}| \ldots | z_{n, b_n}| z_{n,\overline{b}_n}]^\top . $$

    Note that, for any \(\mathbf {b}, \mathbf {x} \in \{0,1\}^n\), we have:

    $$\begin{aligned} \mathbf {z} = \mathsf {Encode}(\mathbf {x}) \Longleftrightarrow F_{\mathbf {b}}(\mathbf {z}) = \mathsf {Encode}(\mathbf {x} \oplus \mathbf {b}). \end{aligned}$$
    (4)

The above equivalence (4) is useful in the Stern’s framework [49] for proving knowledge of binary witness-vectors. Towards the goal, one encodes \(\mathbf {x}\) to \(\mathbf {z}=\mathsf {Encode}(\mathbf {x})\), samples a random binary vector \(\mathbf {b}\) and permutes \(\mathbf {z}\) using \(F_{\mathbf {b}}\). Then one demonstrates to the verifier that the permuted vector \(F_{\mathbf {b}}(\mathbf {z})\) is of the correct form \(\mathsf {Encode}(\mathbf {x} \oplus \mathbf {b})\). Due to (4), the verifier should be convinced that \(\mathbf {z}\) is well formed, which further implies the knowledge of a binary vector \(\mathbf {x}\). Meanwhile, vector \(\mathbf {b}\) serves as a “one-time pad” that perfects hides \(\mathbf {x}\). In addition, if we have to show that \(\mathbf {x}\) appears somewhere else, we can use the same \(\mathbf {b}\) at those places.

3 Zero-Knowledge Arguments for Quadratic Relations

In this section, we present our \(\mathsf {ZKAoK}\) for quadratic relations. More concretely, our arguments demonstrate that a given value \(\mathbf {c}\) is an honest evaluation of the form \(\mathbf {A}\cdot \mathbf {r}\oplus \mathbf {e}\), where \(\mathbf {A},\mathbf {r},\mathbf {e}\) are all secret and may satisfy other constraints. In the following, we present our \(\mathsf {ZKAoK}\) for a variant of \(\mathsf {LPN}\) relation, where we consider secret \(\mathbf {A}\in \mathbb {Z}_2^{n\times m},\mathbf {r}\in \mathsf {B}(m,t_r),\mathbf {e}\in \mathsf {B}(n,t)\). Looking ahead, this protocol is crucial in Sect. 4.2 that proves a message satisfies the permissive relation.

3.1 Proving a Variant of LPN Relation with Hidden Matrix

Let \(n,m,t,t_r\) be positive integers, \(\mathbf {A}\in \mathbb {Z}_2^{n\times m}\), \(\mathbf {r}\in \mathsf {B}(m,t_r)\), \(\mathbf {e}\in \mathsf {B}(n,t), \mathbf {c}\in \mathbb {Z}_2^n\). We now present our \(\mathsf {ZKAoK}\) that allows \(\mathcal {P}\) to prove its knowledge of \(\mathbf {A},\mathbf {r},\mathbf {e}\) such that \(\mathbf {c}=\mathbf {A}\cdot \mathbf {r}\oplus \mathbf {e}\). The associated relation is defined as follows:

$$\begin{aligned} R_{\mathrm {VLPN}}=\Big \{ \Big (\mathbf {c}; (\mathbf {A},\mathbf {r},\mathbf {e})\Big ) \in \mathbb {Z}_2^n\times \Big (\mathbb {Z}_2^{n\times m}\times \mathsf {B}(m,t_r)\times \mathsf {B}(n,t)\Big ): \mathbf {c}=\mathbf {A}\cdot \mathbf {r}\oplus \mathbf {e} \Big \}. \end{aligned}$$

To prove \(\mathbf {r}\) has fixed hamming, we introduce the following Hadamard product extension and extended matrix-vector product expansion and their corresponding permutations.

Hadamard Product Extension. Let vectors \(\mathbf {a}\in \{0,1\}^m\), \(\mathbf {r}\in \mathsf {B}(m,t_r)\) and \(\mathbf {c}=[a_1\cdot r_1|a_2\cdot r_2|\cdots |a_{m}\cdot r_m]^{\top }\). The goal is to prove the well-formedness of \(\mathbf {c}\), i.e., \(\mathbf {c}\) is a Hadamard product of two binary vectors, one of which has fixed hamming weight \(t_r\). We therefore introduce the following extension and permutation.

  • Define extension of \(c_i=a_i\cdot r_i\) as \(\mathsf {ext}'(c_i){\mathop {=}\limits ^{\triangle }}\mathsf {ext}'(a_i,r_i)=[\overline{a_i}\cdot r_i|a_i\cdot r_i]^{\top }\in \{0,1\}^2\). Let \(\mathbf {h}'=[0|1]\), then we obtain \(c_i=\mathbf {h}'\cdot \mathsf {ext}'(c_i)\).

  • Define the extension of \(\mathbf {c}\) to be vector of the form

    $$\mathsf {ext}'(\mathbf {a},\mathbf {r})=[\overline{a_1}\cdot r_1|a_1\cdot r_1|\overline{a_2}\cdot r_2|a_2\cdot r_2|\cdots |\overline{a_m}\cdot r_m|a_m\cdot r_m]^{\top } \in \{0,1\}^{2m}.$$
  • For any \(\mathbf {b}=[b_1|b_2|\cdots |b_m]^{\top }\in \{0,1\}^m\), \(\sigma \in \mathsf {S}_m\), define permutation \(\varPsi _{\mathbf {b},\sigma }\) that transforms a vector

    $$\mathbf {z}=[z_1^{(0)}|z_1^{(1)}|z_2^{(0)}|z_2^{(1)}|\cdots |z_m^{(0)}|z_m^{(1)}]^{\top }\in \mathbb {Z}^{2m}$$

    to a vector

    $$\begin{aligned} \varPsi _{\mathbf {b},\sigma }(\mathbf {z})=[z_{\sigma (1)}^{(b_{\sigma (1)})}|z_{\sigma (1)}^{(\overline{b_{\sigma (1)}})}|z_{\sigma (2)}^{(b_{\sigma (2)})}|z_{\sigma (2)}^{(\overline{b_{\sigma (2)}})}|\cdots |z_{\sigma (m)}^{(b_{\sigma (m)})}|z_{\sigma (m)}^{(\overline{b_{\sigma (m)}})}]^{\top }. \end{aligned}$$
  • For any \(\mathbf {a},\mathbf {b}\in \{0,1\}^{m}\), \(\mathbf {r}\in \mathsf {B}(m,t_r)\), \(\sigma \in \mathsf {S}_m\), it is verifiable that the following equivalence holds.

    $$\begin{aligned} \mathbf {z}=\mathsf {ext}'(\mathbf {a},\mathbf {r}) \Longleftrightarrow \varPsi _{\mathbf {b},\sigma }(\mathbf {z}) =\mathsf {ext}'\big (\sigma (\mathbf {a}\oplus \mathbf {b}),\sigma (\mathbf {r})\big ). \end{aligned}$$
    (5)

Example. Let \(m=4,t_r=2\), \(\mathbf {a}=[1|1|0|1]^{\top }\), \(\mathbf {b}=[0|1|0|1]^{\top }\), \(\mathbf {r}=[1|0|0|1]^{\top }\), \(\sigma (i)=i+1\) for \(i\in [3]\) and \(\sigma (4)=1\). We have \(\mathbf {d}=\sigma (\mathbf {a}\oplus \mathbf {b})=[0|0|0|1]^{\top }\), \(\mathbf {e}=\sigma (\mathbf {r})=[0|0|1|1]^{\top }\), and

Extended Matrix-Vector Product Expansion. Let vectors \(\mathbf {a},\mathbf {r}\) be of the form \(\mathbf {a}=[a_{1,1}|\cdots |a_{1,n}|\cdots |a_{m,1}|\cdots |a_{m,n}]^{\top }\in \mathbb {Z}_2^{mn}\) and \(\mathbf {r}=[r_1|\cdots |r_m]^{\top }\in \mathsf {B}(m,t_r)\), and \(\mathbf {c}\in \mathbb {Z}_2^{mn}\) be of the form

$$\mathbf {c}=[a_{1,1}\cdot r_1|\cdots |a_{1,n}\cdot r_1|a_{2,1}\cdot r_2|\cdots |a_{2,n}\cdot r_2|\cdots |a_{m,1}\cdot r_m|\cdots |a_{m,n}\cdot r_m]^{\top }.$$

We now present the techniques to show the well-formedness of \(\mathbf {c}\).

Define extension of \(\mathbf {c}\) to be a vector \(\mathsf {expand}'(\mathbf {a},\mathbf {r})\in \mathbb {Z}_2^{2mn}\) of the form:

Now for \(\mathbf {b}=[b_{1,1}|\cdots |b_{1,n}|b_{2,1}|\cdots |b_{2,n}|\cdots |b_{m,1}|\cdots | b_{m,n}]^{\top }\in \mathbb {Z}_2^{mn}\) and \(\sigma \in \mathsf {S}_m\), we define \(\varPsi '_{\mathbf {b},\sigma }\) that transform vector \(\mathbf {z}\in \{0,1\}^{2mn}\) of the following form

$$\begin{aligned} \mathbf {z}=\big [&z_{1,1}^{(0)}|z_{1,1}^{(1)} |z_{1,2}^{(0)}|z_{1,2}^{(1)}| \cdots |z_{1,n}^{(0)}|z_{1,n}^{(1)}| \\&z_{2,1}^{(0)}|z_{2,1}^{(1)} |z_{2,2}^{(0)}|z_{2,2}^{(1)} | \cdots | z_{2,n}^{(0)}|z_{2,n}^{(1)}| \cdots | \\&z_{m,1}^{(0)}|z_{m,1}^{(1)} |z_{m,2}^{(0)}|z_{m,2}^{(1)} | \cdots |z_{m,n}^{(0)}|z_{m,n}^{(1)} \big ] \end{aligned}$$

to vector \(\varPsi '_{\mathbf {b},\sigma }\) of the following form

$$\begin{aligned} \varPsi '_{\mathbf {b}, \sigma }(\mathbf {z})=\big [&y_{1,1}^{(0)}|y_{1,1}^{(1)} |y_{1,2}^{(0)}|y_{1,2}^{(1)}| \cdots |y_{1,n}^{(0)}|y_{1,n}^{(1)}| \\&y_{2,1}^{(0)}|y_{2,1}^{(1)} |y_{2,2}^{(0)}|y_{2,2}^{(1)} | \cdots | y_{2,n}^{(0)}|y_{2,n}^{(1)}| \cdots | \\&y_{m,1}^{(0)}|y_{m,1}^{(1)} |y_{m,2}^{(0)}|y_{m,2}^{(1)} | \cdots |y_{m,n}^{(0)}|y_{m,n}^{(1)} \big ] \end{aligned}$$

such that \(y_{i,j}^{(0)}=z_{\sigma (i),j}^{(b_{\sigma (i),j})}\) and \(y_{i,j}^{(1)}=z_{\sigma (i),j}^{(\overline{b_{\sigma (i),j}})}\) for \(i\in [n],j\in [m]\). For ease of notation, given \(\mathbf {f}=(\mathbf {f}_1\Vert \cdots \Vert \mathbf {f}_m)\in \{0,1\}^{mn}\), where each \(\mathbf {f}_i\in \{0,1\}^n\), and \(\sigma \in \mathsf {S}_m\), define

$$\sigma ^{(n)}(\mathbf {f})=(\mathbf {f}_{\sigma (1)}\Vert \cdots \Vert \mathbf {f}_{\sigma (m)}).$$

Precisely, \(\sigma ^{(n)}\) permutes the blocks of \(\mathbf {f}\) using \(\sigma \). The following equivalence then immediately follows from (5) for \(\mathbf {a},\mathbf {b}\in \{0,1\}^{mn}\), \(\mathbf {r}\in \mathsf {B}(m,t_r)\), \(\sigma _r\in \mathsf {S}_m\).

$$\begin{aligned} \mathbf {z}=\mathsf {expand}'(\mathbf {a},\mathbf {r}) \Longleftrightarrow \varPsi '_{\mathbf {b},\sigma _r}(\mathbf {z})=\mathsf {expand}'\big (\sigma _r^{(n)}(\mathbf {a}\oplus \mathbf {b}),\sigma _r(\mathbf {r})\big ). \end{aligned}$$
(6)

The Zero-Knowledge Argument. We now transform the relation \(R_{\mathrm {VLPN}} \) to an instance of \(R_{\mathrm {abstract}}\) such that the equivalences in (1) hold. Write \(\mathbf {A}=[\mathbf {a}_1|\cdots |\mathbf {a}_m]\in \mathbb {Z}^{n\times m}_2\) and \(\mathbf {r}=[r_1| \cdots | r_m]^{\top }\in \mathbb {Z}_2^m\), then we have

where and \(\mathbf {z}_i=(\mathsf {ext}'(a_{i,1},r_i)\Vert \cdots \Vert \mathsf {ext}'(a_{i,n},r_i))\in \mathbb {Z}_2^{2n}\). Denote \(\mathbf {H}_{n,m}'=\underbrace{[\mathbf {H}_{n,1}'|\cdots |\mathbf {H}_{n,1}']}_{m~\text {times}}\in \mathbb {Z}_2^{n\times 2mn}\), \(\mathbf {z}=(\mathbf {z}_1\Vert \cdots \Vert \mathbf {z}_m)\in \mathbb {Z}_2^{2mn}\), and \(\mathbf {a}=[a_{1,1}|\cdots |a_{1,n}|a_{2,1}|\cdots |a_{2,n}|\cdots |a_{m,1}|\cdots |a_{m,n}]^{\top }\in \mathbb {Z}_2^{mn}\). Then \(\mathbf {z}\) is indeed the extended expansion vector of \(\mathbf {a}\) and \(\mathbf {r}\), i.e., \(\mathbf {z}=\mathsf {expand}'(\mathbf {a},\mathbf {r})\). If no ambiguity caused, we write \(\mathbf {z}=\mathsf {expand}'(\mathbf {A},\mathbf {r})\). Hence, we obtain the following:

$$\begin{aligned} \mathbf {c}=\mathbf {A}\cdot \mathbf {r}\oplus \mathbf {e}\Longleftrightarrow \mathbf {c}=\mathbf {H}_{n,m}'\cdot \mathsf {expand}'(\mathbf {A},\mathbf {r})\oplus \mathbf {e}. \end{aligned}$$
(7)

Denote \(\mathbf {M}_{\mathrm {VLPN}}=[\mathbf {H}_{n,m}'|\mathbf {I}_n]\in \mathbb {Z}_2^{n\times L_{\mathrm {VLPN}}}\) and \(\mathbf {w}_{\mathrm {VLPN}}=(\mathsf {expand}'(\mathbf {A},\mathbf {r})\Vert \mathbf {e})\in \mathbb {Z}_2^{L_{\mathrm {VLPN}}}\) with \(L_{\mathrm {VLPN}}=2mn+n\). Hence \(\mathbf {c}{\mathop {=}\limits ^{\triangle }}\mathbf {v}_{\mathrm {VLPN}}=\mathbf {M}_{\mathrm {VLPN}}\cdot \mathbf {w}_{\mathrm {VLPN}}\bmod 2\).

Now we are ready to specify the set \(\mathsf {VALID}_{\mathrm {VLPN}}\) that contains of secret vector \(\mathbf {w}_{\mathrm {VLPN}}\), the set \(\mathcal {S}_{\mathrm {VLPN}}\), and permutations \(\{\varGamma _{\phi }:\phi \in \mathcal {S}_{\mathrm {VLPN}}\}\) such that the equivalences in (1) hold. To this end, let \(\mathsf {VALID}_{\mathrm {VLPN}}\) contain all vectors \(\widehat{\mathbf {w}}_{\mathrm {VLPN}}=( \widehat{\mathbf {z}}\Vert \widehat{\mathbf {e}})\in \mathbb {Z}_2^{2mn+n}\) satisfying the following constraints:

  • There exists \(\widehat{\mathbf {a}}\in \mathbb {Z}_2^{nm}\) and \(\widehat{\mathbf {r}}\in \mathsf {B}(m,t_r)\) such that \(\widehat{\mathbf {z}}=\mathsf {expand}'(\widehat{\mathbf {a}},\widehat{\mathbf {r}})\).

  • \(\widehat{\mathbf {e}}\in \mathsf {B}(n,t)\).

It is easy to that the secret vector \(\mathbf {w}_{\mathrm {VLPN}}\) belongs to \(\mathsf {VALID}_{\mathrm {VLPN}}\). Let \(\mathcal {S}_{\mathrm {VLPN}}=\{0,1\}^{mn}\times \mathsf {S}_m\times \mathsf {S}_n\). Then for each \(\phi =(\mathbf {b},\sigma _{r},\sigma _e)\in \mathcal {S}_{\mathrm {VLPN}},\) define the permutation \(\varGamma _{\phi }\) that transforms vector of the form \(\widehat{\mathbf {w}}_{\mathrm {VLPN}}=(\widehat{\mathbf {z}}\Vert \widehat{\mathbf {e}})\) with \(\widehat{\mathbf {z}}\in \mathbb {Z}_2^{2mn},\widehat{\mathbf {e}}\in \mathbb {Z}_2^n \) to vector \(\varGamma _{\phi }(\widehat{\mathbf {w}}_{\mathrm {VLPN}})=(\varPsi '_{\mathbf {b},\sigma _{r}}(\widehat{\mathbf {z}})\Vert \sigma _e(\widehat{\mathbf {e}}))\).

Based on the equivalence observed in (6) and (2), it can be checked that the conditions in (1) are satisfied and we have successfully reduced the consider relation \(R_{\mathrm {VLPN}}\) to an instance of \(R_{\mathrm {abstract}}\). Now \(\mathcal {P}\) and \(\mathcal {V}\) can run the Stern-like protocol for the reduced statement \(R_{\mathrm {VLPN}}\) (see the full version). The resulting protocol is a statistical \(\mathsf {ZKAoK}\) with perfect completeness, soundness error 2/3, and communication cost \(\mathcal {O}(L_{\mathrm {VLPN}})=\mathcal {O}(mn)=\mathcal {O}(\lambda ^2)\) bits.

4 Message Filtering in Zero-Knowledge

In this section, we first specify the 2 policies we use for filtering messages encrypted in the code-based FDGE scheme of Sect. 6. Then we discuss our main ideas for proving in \(\mathsf {ZK}\) that the underlying messages satisfy the given policies.

4.1 Formulation

Let p, t, \(d \in \mathbb {Z}^+\) such that \(p> t > d\). A string \(\mathbf {y} = [y_1| \cdots | y_t]^{\top } \in \{0,1\}^t\) is called a substring of string \(\mathbf {w} = [w_1| \cdots | w_p]^{\top } \in \{0,1\}^p\), denoted as \(\mathbf {y} \sqsubset \mathbf {w}\), if there exists an integer \(i \in [1, p-t+1]\) such that \(y_j = w_{i+j-1}\) for all \(j \in [1,t]\). The Hamming distance between \(\mathbf {x}, \mathbf {y} \in \{0,1\}^t\), denoted by \(d_H(\mathbf {x}, \mathbf {y})\), is the number of coordinates at which \(\mathbf {x}\) and \(\mathbf {y}\) differ. In other words, \(d_H(\mathbf {x}, \mathbf {y}) = wt(\mathbf {x} \oplus \mathbf {y})\).

Let \(\mathbf {w} \in \{0,1\}^p\) be an encrypted message and let \(S = \{\mathbf {s}_1, \ldots , \mathbf {s}_k\}\) be a given list of \(k \ge 1\) keywords, where \(\mathbf {s}_i \in \{0,1\}^t\), for all \(i \in [1,k]\). We will realize 2 commonly used policies of message filtering.

  1. 1.

    “Permissive”: \(\mathbf {w}\) is a legitimate message if and only if there exists \(i \in [1,k]\) such that \(\mathbf {s}_i\) is a substring of \(\mathbf {w}\). The induced relation \(R_\mathsf{permit}\) is defined as

    $$\begin{aligned} R_{\mathrm {permit}} = \big \{\big ((\mathbf {s}_1, \ldots , \mathbf {s}_k), \mathbf {w}\big ) \in (\{0,1\}^t)^k \times \{0,1\}^p: ~\exists i \in [1,k]~\text { s.t. } \mathbf {s}_i \sqsubset \mathbf {w} \big \}. \end{aligned}$$
    (8)
  2. 2.

    “Prohibitive”: \(\mathbf {w}\) is a legitimate message if and only if for every length-t substring \(\mathbf {y}\) of \(\mathbf {w}\) and every \(\mathbf {s}_i \in S\), their Hamming distance is at least d. The corresponding relation \(R_\mathsf{prohibit}\) is defined as

    $$\begin{aligned} R_{\mathrm {prohibit}}&= \big \{\big ((\mathbf {s}_1, \ldots , \mathbf {s}_k), \mathbf {w}\big ) \in (\{0,1\}^t)^k \times \{0,1\}^p: \nonumber \\&d_H(\mathbf {s}_i, \mathbf {y}) \ge d, \forall i \in [1,k], \forall \mathbf {y} \sqsubset \mathbf {w} \big \}. \end{aligned}$$
    (9)

In the following, we will discuss our strategies for proving that message \(\mathbf {w}\) satisfies each of the above policies.

4.2 Zero-Knowledge for the Permissive and Prohibitive Relations

Let \(\mathbf {w} = [w_1| \cdots | w_p]^{\top }\), and for each \(i \in [p-t+1]\), let \(\mathbf {w}_{[i]} = [w_i|\cdots | w_{i+t-1}]^{\top }\) be its i-th substring of length t. Our ideas for proving that \(\big ((\mathbf {s}_1, \ldots , \mathbf {s}_k), \mathbf {w}\big ) \in R_\mathsf{permit}\) in \(\mathsf {ZK}\) are as follows. First, we form matrices

$$\begin{aligned} \mathbf {W} = [\mathbf {w}_{[1]} \mid \cdots \mid \mathbf {w}_{[p-t+1]}] = \begin{bmatrix} w_1 &{} w_2 &{} \cdots &{} w_{p-t+1} \\ w_2 &{} w_3 &{} \cdots &{} w_{p-t+2} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ w_t &{} w_{t+1} &{} \cdots &{} w_p \end{bmatrix} \in \{0,1\}^{t \times (p-t+1)}, \end{aligned}$$

\(\mathbf {S} = [\mathbf {s}_1 \mid \cdots \mid \mathbf {s}_k] \in \{0,1\}^{t \times k}\), and denote \(\mathsf {permit}(\mathbf {w})=(\mathbf {w}_{[1]}\Vert \cdots \Vert \mathbf {w}_{[p-t+1]})\in \{0,1\}^{t(p-t+1)}\). We note that \(\big ((\mathbf {s}_1, \ldots , \mathbf {s}_k), \mathbf {w}\big ) \in R_{\mathrm {permit}}\) if and only if there exist a column \(\mathbf {w}_{[i]}\) of \(\mathbf {W}\) and a column \(\mathbf {s}_j\) of \(\mathbf {S}\) such that \(\mathbf {w}_{[i]} = \mathbf {s}_j\). Then, we observe that the task of the prover \( \mathcal {P}\) is equivalent to proving the existence of \(\mathbf {W},\mathbf {g}, \mathbf {h}\) such that

$$ \mathbf {W} \cdot \mathbf {g} = \mathbf {S} \cdot \mathbf {h}~~ \wedge ~~\mathbf {g} \in \mathsf {B}(p-t+1, 1)~~\wedge ~~\mathbf {h} \in \mathsf {B}(k, 1). $$

To this end, we employ techniques for proving linear relation and quadratic relation (specifically the variant of \(\mathsf {LPN}\) relation), as well as, for fix-weight relations in the framework of Stern’s protocols. In the process, we prove the well-formedness of \(\mathbf {W}\). Details are in the full version of this paper. The resulting protocol has communication cost \(\mathcal {O}(t \cdot (p-t) + k)\) and is a sub-protocol in our \(\mathsf {FDGE}\) construction of Sect. 6, where we additionally prove that \(\mathbf {w}\) is the same as the plaintext encrypted in a given McEliece ciphertext.

On the other hand, to prove that \(\big ((\mathbf {s}_1, \ldots , \mathbf {s}_k), \mathbf {w}\big ) \in R_{\mathrm {prohibit}}\), we consider \((p-t+1)\cdot k\) pairs \((\mathbf {w}_{[i]}, \mathbf {s}_j)\) and aim to prove that all the sums \(\mathbf {z}_{i, j} = \mathbf {w}_{[i]} \oplus \mathbf {s}_j \in \{0,1\}^t\) have Hamming weight at least d. In other words, we reduce the problem to \((p-t+1)\cdot k\) sub-problems, for each of which, we needs to prove that \(\mathbf {z}_{i, j}\) contains at least d coordinates equal to 1. To this end, we perform the following extension trick, adapted from [37].

We append \((t-d)\) coordinates to \(\mathbf {z}_{i,j} \in \{0,1\}^t\) to get \(\mathbf {z}^{*}_{i, j} \in \{0,1\}^{2t-d}\) such that \(wt(\mathbf {z}^*_{i,j}) = t\), i.e., \(\mathbf {z}^{*}_{i,j} \in \mathsf {B}(2t-d, t)\). We note that such an extension is always possible if \(wt(\mathbf {z}_{i,j}) \ge d\). Furthermore, the converse also holds: if \(\mathbf {z}^{*}_{i,j} \in \mathsf {B}(2t-d, t)\), then the original \(\mathbf {z}_{i,j}\) must have weight at least \(t - (t-d) = d\). Details are in the full version of this paper. As a result, we obtain a \(\mathsf {ZK}\) protocol for \(R_{\mathrm {prohibit}}\) with communication cost \(\mathcal {O}(t \cdot (p-t+1)\cdot k)\). Similarly to the case of \(R_{\mathrm {permit}}\), this protocol can serve as a sub-protocol in our \(\mathsf {FDGE}\) construction of Sect. 6, allowing us to realize the “prohibitive” filtering policy.

5 Fully Dynamic Group Encryption: Model and Security Requirements

In this section, we first present the model of fully dynamic group encryption \(\mathsf {FDGE}\) that offers both dynamic join and revocation, which is developed from the one proposed by Kiayias et al. [29]. Our model is analogous to the fully dynamic group signature one proposed by Bootle et al. [10]. In a \(\mathsf {FDGE}\) scheme, the parties involved are the sender, the verifier, the group manager \(\mathsf {GM}\) who manages the group of receivers, and the opening authority \(\mathsf {OA}\) who is capable of identifying the recipients of ciphertexts. \(\mathcal {R}\) is a public relation for which a \(\mathsf {FDGE}\) should be verifiable. Receivers can join and leave the group at the choice of the \(\mathsf {GM}\). We assume that the \(\mathsf {GM}\) will publish group information \(\mathsf {info}_{\tau }\) at the beginning of each time epoch \(\tau \). The information depicts changes to the group such as the existing group members or the revoked members at current epoch \(\tau \). It is required that anyone can verify the authenticity and well-formedness of the group information. In addition, by comparing the current group information with the previous one, it is possible to recover the list of members revoked from the group at the current epoch. We also assume that the epoch maintains the order in which the group information was published, i.e., \(\mathsf {info}_{\tau _1}\) precedes \(\mathsf {info}_{\tau _2}\) if \(\tau _1 <\tau _2\).

Compared to [29], our model enables the \(\mathsf {GM}\) to remove some users from the group through a group updating algorithm \(\mathsf {GUpdate}\). Another difference is that we avoid interaction by employing a non-interactive zero-knowledge \((\mathsf {NIZK})\) proof, which has already been considered by Cathalo, Libert and Yung [15]. As highlighted by the authors, non-interaction is highly desirable as the sender, who might be required to repeat the proof with many verifiers, needs to maintain a state and remember all the random coins used to generate the ciphertext.

Formally, a \(\mathsf {FDGE}\) that is verifiable for a public relation \(\mathcal {R}\) consists of the following polynomial-time algorithms.

  • \(\mathsf {Setup}_{\mathsf {init}}(1^{\lambda })\) The algorithm takes as input the security parameter \(1^{\lambda }\) and outputs a set of public parameters \(\mathsf {pp}\).

  • \(\mathsf {Setup}_{\mathsf {OA}}(\mathsf {pp})\) This algorithm is run by the opening authority \(\mathsf {OA}\). It takes as input \(\mathsf {pp}\) and outputs a key pair \((\mathsf {pk}_{\mathsf {OA}}, \mathsf {sk}_{\mathsf {OA}})\).

  • \(\mathsf {Setup}_{\mathsf {GM}}(\mathsf {pp})\) This algorithm is run by the group manger \(\mathsf {GM}\). It takes as input the public parameters \(\mathsf {pp}\) and outputs a key pair \((\mathsf {pk}_{\mathsf {GM}}, \mathsf {sk}_{\mathsf {GM}})\). Meanwhile, \(\mathsf {GM}\) initializes the group information \(\mathsf {info}\) and a public registration directory \(\mathbf {reg}\).

  • \(\mathsf {G}_{\mathcal {R}}(1^{\lambda })\) This randomized algorithm takes as input the security parameter \(\lambda \) and outputs public and secret parameters \((\mathsf {pk}_{\mathcal {R}},\mathsf {sk}_{\mathcal {R}})\) for the relation \(\mathcal {R}\). Note that \(\mathsf {sk}_{\mathcal {R}}\) is an empty string if a publicly samplable relation \(\mathcal {R}\) is considered.

  • \(\mathsf {Sample}_{\mathcal {R}}(\mathsf {pk}_{\mathcal {R}},\mathsf {sk}_{\mathcal {R}})\) This probabilistic algorithm takes \((\mathsf {pk}_{\mathcal {R}},\mathsf {sk}_{\mathcal {R}})\) as input and outputs a statement and witness pair (xw).

  • \(\mathcal {R}(\mathsf {pk}_{\mathcal {R}},x,w)\) The polynomial-time testing algorithm takes as input \((\mathsf {pk}_{\mathcal {R}},x,w)\) and returns 1 if and only if (xw) is in the relation based on the public parameter \(\mathsf {pk}_{\mathcal {R}}\).

  • \(\langle \mathsf {Join},\mathsf {Issue}(\mathsf {sk}_{\mathsf {GM}}) \rangle (\mathsf {pk}_{\mathsf {GM}},\mathsf {info}_{\tau _{\mathrm {current}}})\) This is an interactive protocol securely run between a user and the \(\mathsf {GM}\). Both the \(\mathsf {Join}\) and \(\mathsf {Issue}\) algorithms takes as inputs \(\mathsf {pk}_{\mathsf {GM}}\) and \(\mathsf {info}_{\tau _{\mathrm {current}}}\) at current time epoch \(\tau _{\mathrm {current}}\) while the the latter algorithm takes \(\mathsf {sk}_{\mathsf {GM}}\) as an additional input. Upon successful completion, the algorithm \(\mathsf {Join}\) outputs a user key pair \((\mathsf {pk},\mathsf {sk})\) while \(\mathsf {Issue}\) adds a new record in the directory \(\mathbf {reg}\). Note that \(\mathsf {GM}\) may update group information and that \(\mathbf {reg}\) may store information like user identifier or user public key that may be used by \(\mathsf {GM}\) and \(\mathsf {OA}\) for later updating and opening.

  • \(\mathsf {GUpdate}(\mathsf {sk}_{\mathsf {GM}},\mathcal {S},\mathsf {info}_{\tau _{\mathrm {current}}},\mathbf {reg})\) This algorithm is run by the \(\mathsf {GM}\) who will advance the epoch and update the group information. Given the secret key \(\mathsf {sk}_{\mathsf {GM}}\), a set \(\mathcal {S}\) of active users to be deleted from the group, current group information \(\mathsf {info}_{\tau _{\mathrm {current}}}\), and the directory \(\mathbf {reg}\), the \(\mathsf {GM}\) computes new group information \(\mathsf {info}_{\tau _{\mathrm {new}}}\) and may update the directory \(\mathbf {reg}\) as well. If there is no change to the group information or \(\mathcal {S}\) contains inactive users (who has never joined the group yet or who has been revoked from the group), this algorithm aborts.

  • \(\mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },w,\mathsf {pk},L)\) This randomized encryption algorithm is run by the sender who wishes to encrypt a witness w for its chosen user \(\mathsf {pk}\). It returns a ciphertext \(\psi \) with a certain label L. As in [29], L is a public string bound to the ciphertext that may contain some transaction related data or be empty. If \(\mathsf {pk}\) is not an active user at current time epoch \(\tau \) or \(\mathcal {R}(\mathsf {pk}_{ \mathcal {R}},x,w)=0\), this algorithm aborts. Let \(\mathsf {coins}_{\psi }\) be the random coins used to generate \(\psi \).

  • \(\mathcal {P}\big (\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },\mathsf {pk}_{\mathcal {R}},x,\psi ,L,w,\mathsf {pk} ,\mathsf {coins}_{\psi }\big )\) This randomized proof algorithm is run by the sender who acts as a prover and demonstrates the honest computation of ciphertext \(\psi \). Given all the inputs, it outputs a proof \(\pi _{\psi }\). The proof ensures that there exists a certified and active group member at time \(\tau \), who is able to decrypt \(\psi \) and obtain \(w'\) such that \(\mathcal {R}(\mathsf {pk}_{\mathcal {R}},x,w')=1\), and whose public key is encrypted under \(\mathsf {pk}_{\mathsf {OA}}\) and can be later revealed using the \(\mathsf {OA}\)’s secret key \(\mathsf {sk}_{\mathsf {OA}}\).

  • \(\mathcal {V}\big ((\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },\mathsf {pk}_{\mathcal {R}},x,\psi ,L),\pi _{\psi }\big )\) This verification algorithm is run by any verifier who on input the tuple \((\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\mathsf {info}_{\tau },\mathsf {pk}_{\mathcal {R}},x,\psi ,L)\) and a corresponding proof \(\pi _{\psi }\) outputs bit 1 or 0. If the output is 1, we say the proof \(\pi _{\psi }\) is valid.

  • \(\mathsf {Dec}(\mathsf {info}_{\tau },\mathsf {sk},\psi ,L)\) This decryption algorithm is run by the user in possession of the secret key \(\mathsf {sk}\). Given all the inputs, it outputs \(w'\) such that \(\mathcal {\mathcal {R}}(\mathsf {pk}_{\mathcal {R}},x,w')=1\) or \(\bot \) otherwise.

  • \(\mathsf {Open}(\mathsf {info}_{\tau },\mathsf {sk}_{\mathsf {OA}},\psi ,L)\) This opening algorithm is run by the \(\mathsf {OA}\) who holds the key \(\mathsf {sk}_{\mathsf {OA}}\). Given the inputs, it returns an active user public key \(\mathsf {pk}\) or \(\bot \) to indicate opening failure.

To ease the notations, we additionally use the following algorithms in the security experiments.

  • \(\mathsf {IsActive}(\mathsf {info}_{\tau },\mathsf {pk})\) This algorithm returns 1 if user \(\mathsf {pk}\) is an active user at time \(\tau \) and 0 otherwise.

Correctness. Informally, correctness of a \(\mathsf {GE}\) scheme requires that an honest proof of correct encryption is always valid, that the designated receiver can always recover the encrypted message, and that the \(\mathsf {GM}\) is capable of identifying the receiver. We model this requirement in the experiment \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {corr}}(1^{\lambda })\). Below, we first define some oracles that are accessible to the adversary.

  • \(\mathsf {AddU}(\mathsf {sk}_{\mathsf {GM}})\) This oracle adds an honest user to the group at current time \(\tau _{\mathrm {current}}\). It simulates the interactive protocol \(\langle \mathsf {Join},\mathsf {Issue}(\mathsf {sk}_{\mathsf {GM}})\rangle (\mathsf {pk}_{\mathsf {GM}},\mathsf {info}_{\tau _{\mathrm {current}}})\) and maintains an honest user list \(\mathsf {HUL}\). Let the output of \(\mathsf {Join}\) be \((\mathsf {pk},\mathsf {sk})\). It then adds \(\mathsf {pk}\) to \(\mathsf {HUL}\).

  • \(\mathsf {GUp}(\cdot )\) This oracle allows the adversary to remove a set of active users from the group at current time epoch \(\tau _{\mathrm {current}}\). When a set \(\mathcal {S}\) is queried, it advances the time epoch to \(\tau _{\mathrm {new}}\) and updates the group information to \(\mathsf {info}_{\tau _{\mathrm {new}}}\) by executing the algorithm \(\mathsf {GUpdate}(\mathsf {sk}_{\mathsf {GM}},\mathcal {S},\mathsf {info}_{\tau _{\mathrm {current}}},\mathbf {reg})\). As the algorithm \(\mathsf {GUpdate}\), it may update the \(\mathbf {reg}\).

Definition 1

Define \(\mathbf {Adv}_{\mathcal {A}}^{\mathsf {corr}}(1^{\lambda })=\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {corr}}(1^{\lambda })=1]\) as the advantage of an adversary \(\mathcal {A}\) against correctness in the experiment \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {corr}}(1^{\lambda })\). A \(\mathsf {FDGE}\) is correct if, for any PPT adversary \(\mathcal {A}\), the advantage of \(\mathcal {A}\) is negligible in \(\lambda \).

figure a

5.1 Formulation of the Security Requirements

We now present three security requirements: message secrecy, anonymity, and soundness for \(\mathsf {FDGE}\), which are carefully adapted from the dynamic case. We formulate those requirements through experiments that are run between a challenger and an adversary. As mentioned earlier, the adversary is empowered with attack capability to the maximum extent possible. Specifically, in the definition of message secrecy and anonymity, it fully corrupts \(\mathsf {GM}\) and/or \(\mathsf {OA}\) and generates keys arbitrarily on behalf of them. Regarding soundness, only partial corruption of the \(\mathsf {OA}\) whose key is still honestly generated is allowed. Details of the security requirements are described below.

Message Secrecy. This security notion protects the appointed receiver from a malicious adversary who tries to extract the information about the encrypted message. It requires that the adversary cannot distinguish a ciphertext that is an encryption of a real witness or encryption of a randomly chosen one even though it could fully corrupt the \(\mathsf {GM}\), the \(\mathsf {OA}\), and all group members except one that is chosen as the receiver. We model this requirement using \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sec}-b}(1^\lambda )\) for \(b\in \{0,1\}\). In the following, we define some oracles that will be used in the experiment.

  • \(\mathsf {USER}()\) This oracle simulates the algorithm \(\mathsf {Join}\), when interacted with adversary \(\mathcal {A}\) who plays the role of \(\mathsf {GM}\), to introduce an honest user to the group at current time \(\tau _{\mathrm {current}}\). it maintains an honest user list \(\mathsf {HUL}\). Let the output of this oracle be \((\mathsf {pk},\mathsf {sk})\) and add \(\mathsf {pk}\) to \(\mathsf {HUL}\).

  • \(\mathsf {RevealU}(\cdot )\) This oracle allows the adversary to learn an honest user secret key. It maintains a bad user list \(\mathsf {BUL}\). When a user public key \(\mathsf {pk}\) is queried, it returns the corresponding secret key \(\mathsf {sk}\) and adds \(\mathsf {pk}\) to \(\mathsf {BUL}\) if \(\mathsf {pk}\notin \mathsf {BUL}\), and aborts otherwise.

  • \(\mathsf {CH}_{\mathsf {ror}}^b(\tau ,\mathsf {pk},w,L)\) This is a real-or-random challenge oracle which is only called once. It returns \((\psi ,\mathsf {coins}_{\psi })\) such that \(\psi \leftarrow \mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\mathsf {info}_{\tau },w,\mathsf {pk},L)\) if \(b=1\), whereas if \(b=0\) \(\psi \leftarrow \mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\mathsf {info}_{\tau },w',\mathsf {pk},L)\) where \(w'\) is sampled uniformly from the space of all possible plaintexts. In both cases, \(\mathsf {coins}_{\psi }\) are the random coins used for the computation of the challenged ciphertext \(\psi \).

  • \(\mathsf {DEC}(\mathsf {sk},\cdot )\) This is an oracle for the decryption function \(\mathsf {Dec}\). When \((\psi ,\tau ,L)\) is queried to this oracle, it returns the output of \(\mathsf {Dec}(\mathsf {info}_{\tau },\mathsf {sk},\psi ,L)\). When a tuple \((\mathsf {pk},\psi ,\tau ,L)\) should be rejected by this oracle, we write \(\mathsf {DEC}^{\lnot (\psi ,\tau ,L,)}(\cdot )\).

  • \(\mathsf {PROVE}_{\mathcal {P},\mathcal {P'}}^b(\mathsf {pk},\tau ,\mathsf {pk}_{\mathcal {R}},x,w,L,\psi ,\mathsf {coins}_{\psi })\) This oracle can be invoked a polynomial number times. It generates proofs of validity of the challenged ciphertext. If \(b=1\), let \(\pi _{\psi } \leftarrow \mathcal {P}\big (\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },\mathsf {pk}_{\mathcal {R}},x,\psi ,L,w,\mathsf {pk} ,\mathsf {coins}_{\psi }\big )\) and return the output \(\pi _{\psi }\). If \(b=0\), it runs a simulator \(\mathcal {P}'\) that takes the same inputs as \(\mathcal {P}\) except \((w,\mathsf {coins}_{\psi })\) and returns whatever \(\mathcal {P}'\) outputs.

In the experiment \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sec}-b}(1^{\lambda })\), the adversary \(\mathcal {A}\) fully controls the \(\mathsf {GM}\) and the \(\mathsf {OA}\), and enrolls honest users to the group by interacting with the oracle \(\mathsf {USER}\). It is entitled to corrupt at most all but one honest users by querying the \(\mathsf {RevealU}\) oracle and to update the group information, insofar as \(\mathsf {info}\) and \(\mathbf {reg}\) are well-formed. At some point, the adversary chooses a targeted receiver \(\mathsf {pk}^*\) and has access to the \(\mathsf {DEC}\) oracle with respect to \(\mathsf {pk}^*\). It then specifies a certain epoch \(\tau ^*\), a label \(L^*\) together with the relation \(\mathsf {pk}_{\mathcal {R}}^*\) and the statement witness pair \((x^*,w^*)\). Afterwards, the challenger encrypts the witness \(w^*\) if \(b=1\) or a random message if \(b=0\) to the receiver \(\mathsf {pk}^*\), and sends the resultant ciphertext \(\psi ^*\) to \(\mathcal {A}\). After receiving it, \(\mathcal {A}\) is allowed to query the \(\mathsf {PROVE}\) oracle for proofs of its validity and still has access to the \(\mathsf {DEC}\) oracle with respect to \(\mathsf {pk}^*\) with the natural restriction that \((\psi ^*,\tau ^*,L^*)\) is forbidden. Finally, \(\mathcal {A}\) is asked to guess the challenger’s choice.

Definition 2

Let the advantage of an adversary \(\mathcal {A}\) against message secrecy be \(\mathbf {Adv}_{\mathcal {A}}^{\mathsf {sec}-b}(1^{\lambda })=|\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sec}-1}(1^{\lambda })=1]-\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sec}-0}(1^{\lambda })=1]|\). A \(\mathsf {FDGE}\) satisfies message secrecy if, for any PPT adversary \(\mathcal {A}\), the advantage of \(\mathcal {A}\) is negligible in \(\lambda \).

figure b

Anonymity. This notion aims to prevent the adversary from learning information about the identity of the receiver of a ciphertext. It requires that an adversary without possession of the secret key of \(\mathsf {OA}\) is not capable of distinguishing which one of two group members of its choice is the recipient of a ciphertext. Note that the adversary is forbidden from corrupting these two challenged members since they know whether a ciphertext is intended for them by simply decrypting it. We model this requirement in \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {anon}-b}(1^\lambda )\) for \(b\in \{0,1\}\), which will utilize the following challenge oracle \(\mathsf {CH}_{\mathsf {anon}}^b\) and opening oracle \(\mathsf {OPEN}\).

  • \(\mathsf {CH}_{\mathsf {anon}}^b(\tau ,\mathsf {pk}_0,\mathsf {pk}_1,w,L)\) This is a challenge oracle that can be called only once. It returns \((\psi ,\mathsf {coins}_{\psi })\) such that \(\psi \leftarrow \mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\mathsf {info}_{\tau },w,\mathsf {pk}_b,L)\).

  • \(\mathsf {OPEN}(\mathsf {sk}_{\mathsf {OA}},\cdot )\) This is an oracle for the opening algorithm \(\mathsf {Open}\). When decryption of a tuple \((\psi ,\tau ,L)\) is requested, it returns \(\mathsf {Open}(\mathsf {info}_{\tau },\mathsf {sk}_{\mathsf {OA}},\psi ,L)\). When a tuple \((\psi ,\tau ,L)\) is forbidden, we write \(\mathsf {OPEN}^{\lnot (\mathsf {\psi },\tau ,L)}(\mathsf {sk}_{\mathsf {OA}},\cdot )\).

In the experiment, the adversary \(\mathcal {A}\) can fully corrupt the \(\mathsf {GM}\). By interacting with the oracles \(\mathsf {USER},\mathsf {RevealU}\), it can also introduce honest users to the group and learn up to all but two secret keys at a later point. As in the \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sec}-b}(1^\lambda )\), \(\mathcal {A}\) is allowed to update the group at its will, provided that the group information \(\mathsf {info}\) and \(\mathbf {reg}\) are well-formed. Moreover, \(\mathcal {A}\) has access to the \(\mathsf {OPEN}(\mathsf {sk}_{\mathsf {OA}},\cdot )\) oracle. At some point, \(\mathcal {A}\) specifies two targeted receivers \(\mathsf {pk}_0^*,\mathsf {pk}_1^*\) and is granted access to the \(\mathsf {DEC}\) oracle with respect to both recipients. Next, it outputs a specific epoch \(\tau ^*\) and \((\mathsf {pk}_{\mathcal {R}}^*,x^*,w^*)\) to the challenger, who will encrypt the witness to receiver \(\mathsf {pk}_b^*\). Thereafter, the challenger sends the challenge ciphertext \(\psi ^*\) to \(\mathcal {A}\). The latter is further allowed to query the proof of validity of \(\psi ^*\) and accessible to oracles \(\mathsf {DEC}(\mathsf {sk}_0^*,\cdot ),\mathsf {DEC}(\mathsf {sk}_1^*,\cdot ), \mathsf {OPEN}(\mathsf {sk}_{\mathsf {OA}},\cdot )\) with the constraint that the tuple \((\psi ^*,\tau ^*,L^*)\) is not queried to any of the oracles. Lastly, \(\mathcal {A}\) is asked to guess which one of the two users is the challenger’s choice.

Definition 3

Define the advantage of an adversary \(\mathcal {A}\) against anonymity as \(\mathbf {Adv}_{\mathcal {A}}^{\mathsf {anon}}(1^{\lambda })=|\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {anon}-1}(1^{\lambda })=1]-\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {anon}-0}(1^{\lambda })=1]|\). A \(\mathsf {FDGE}\) satisfies anonymity if, for any PPT adversary \(\mathcal {A}\), the advantage of \(\mathcal {A}\) is negligible in \(\lambda \).

figure c

Soundness. This notion requires that the adversary cannot generate a ciphertext with a valid proof associated with time epoch \(\tau \) such that (1) the opening of the ciphertext is a public key that does not belong to any active group member at time \(\tau \), (2) the revealed public key is not in the language \(\mathcal {L}_{\mathsf {pk}}^{\mathsf {pp}}\) of valid public keys, (3) the ciphertext is not in the space \(\mathcal {L}_{\mathsf {ciphertext}}^{(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\tau ,\mathsf {pk}_{\mathcal {R}},x,L,\mathsf {pk})}\) of valid ciphertexts. Note that \(\mathcal {L}_{\mathsf {pk}}^{\mathsf {pp}}=\{ \mathsf {pk}: \exists ~\mathsf {sk}~\text {such that}~(\mathsf {pk},\mathsf {sk})~\text {is a valid user key pair}\}\) and that

$$\begin{aligned} \mathcal {L}_{\mathsf {ciphertext}}^{(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\tau ,\mathsf {pk}_{\mathcal {R}},x,L,\mathsf {pk})}=\{ \psi : \exists ~w~\text {such that}~ \psi =\mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}},\mathsf {info}_{\tau },w,\mathsf {pk},L), \\ \mathcal {\mathcal {R}}(\mathsf {pk}_{\mathcal {R}},x,w)=1, ~\text {and}~\mathsf {IsActive}(\mathsf {info}_{\tau },\mathsf {pk})=1 \}. \end{aligned}$$

We model this requirement in the experiment \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sound}}(1^\lambda )\). The adversary is given the secret key of \(\mathsf {OA}\) and is permitted to adaptively register users to the group through oracle queries \(\mathsf {REG}(\mathsf {sk}_{\mathsf {GM}})\), as defined below. In addition, it can remove some users from the group by querying the oracle \(\mathsf {GUp}(\cdot )\).

  • \(\mathsf {REG}(\mathsf {sk}_{\mathsf {GM}})\) This oracle simulates the \(\mathsf {GM}\) and runs the algorithm \(\mathsf {Issue}\). When queried by adversary \(\mathcal {A}\) who plays the role of a user, it interacts with \(\mathcal {A}\) and if successful registers an adversarially controlled user to the group at current time \(\tau _{\mathrm {current}}\). As the algorithm \(\mathsf {Issue}\), it maintains a public directory \(\mathbf {reg}\) and may update the group information as well.

Definition 4

Define \(\mathbf {Adv}_{\mathcal {A}}^{\mathsf {sound}}(1^{\lambda })=\Pr [\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sound}}(1^{\lambda })=1]\) as the advantage of an adversary \(\mathcal {A}\) against soundness in the experiment \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {sound}}(1^{\lambda })\). A \(\mathsf {FDGE}\) satisfies soundness if, for any PPT adversary \(\mathcal {A}\), the advantage of \(\mathcal {A}\) is negligible in \(\lambda \).

figure d

6 A Code-Based Fully Dynamic Group Encryption Scheme

To build a code-based \(\mathsf {FDGE}\) scheme, we require a key-private CCA2-secure encryption scheme [3], a digital signature scheme, and a zero-knowledge proof (argument) of knowledge protocol. In this paper, we work with the \(\mathsf {ZKAoK}\) within Stern’s framework [49]. In terms of the encryption scheme, we choose to work with the McEliece cryptosystem [39], specifically the randomized variant from [46]. The latter indeed has pseudorandom ciphertexts, which implies key-private CPA-security. To further achieve CCA2-security, we apply the Naor-Yung double encryption technique [44]. Note that there are other CCA2-secure variants of McEliece scheme like [16, 19, 30]. However, they either do not operate well in the Stern’s framework or are completely impractical. Regarding the digital signature, we employ the Merkle-tree accumulator suggested in [45]. Precisely, when a user requests to join the group, it first generates its encryption key pair \((\mathsf {pk},\mathsf {sk})\), and sends \(\mathsf {pk}\) and its non-zero hash value \(\mathbf {d}\) to \(\mathsf {GM}\). The latter, if accepts, then computes the Merkle tree root, where the leaf nodes are the hash values of all users. The witness for \(\mathbf {d}\) is the proof of user’s membership. To achieve dynamicity, following [38], we use an updating algorithm akin to [38] to set up the system so that (1) the value of the leaf node associated with a user who has not joined or who has been removed from the group is \(\mathbf {0}\) (2) while it is updated to \(\mathbf {d}\) when this user joins the group. When a sender encrypts messages to a user at some epoch, it has to show that the user’s non-zero hash value is accumulated in the tree in this epoch. This mechanism effectively distinguish active users who are valid recipients of ciphertexts from those who are not.

As in the KTY model [29], we also require that user encryption keys are valid (i.e., in the language \(\mathcal {L}_{\mathsf {pk}}^{\mathsf {pp}}\)). One possible solution would be requiring a proof of knowledge of the McEliece decryption key when a user joins the group. This is however quite complicated and inefficient. Instead, \(\mathsf {GM}\) encrypts random messages under the user’s encryption key and asks the user to output the correct messages. By choosing the parameters properly, the running time of guessing correctly the messages if the user does not know the underlying decryption key is exponential. This then enforces validity of user encryption keys.

6.1 Description of the Scheme

Our scheme allows encryption witness \(\mathbf {w}\in \{0,1\}^p\) that satisfies the permissive relation \(R_{\mathrm {permit}}\) and/or the prohibitive relation \(R_{\mathrm {prohibit}}\). For simplicity, we present \(R_{\mathrm {permit}}\) in the following construction. The details are described below.

  • \(\mathsf {Setup}_{\mathsf {init}}(1^{\lambda })\) On input the security parameter \(1^\lambda \), this algorithm proceeds as follows.

    • Specify an integer \(\ell =\ell (\lambda )\) that determines the maximum expected number \(N= 2^\ell \) of potential users.

    • Choose \(n=\mathcal {O}(\lambda )\), \(c=\mathcal {O}(1)\) such that c divides n, and set \(m=2^c\cdot \frac{2n}{c}\). Choose an integer \(t_m <m\).

    • Choose \(t_1=t_1(\lambda )\), \(k_1=k_1(\lambda )\) and \(t_2=t_2(\lambda )\), \(k_2=k_2(\lambda )\) such that \((n,k_1,t_1)\), \((n,k_2,t_2)\) are two sets of parameters for the McEliece encryption scheme.

    • Sample a random matrix \(\mathbf {B}\xleftarrow {\$}\mathbb {Z}_2^{n\times m}\) that specifies a hash function \(h_{\mathbf {B}}\) that will be used build Merkle tree (see the full version of this paper, as well as [45]).

    • Pick a statistical hiding and computationally binding commitment scheme \(\mathsf {COM}:\{0,1\}^*\rightarrow \{0,1\}^n\) like the one in [45, Section 3.1]. This will serve as a building block for the \(\mathsf {ZK}\) argument systems.

    • Let \(\mathcal {H}_{\mathsf {FS}}:\{0,1\}^*\rightarrow \{1,2,3\}^{\kappa }\), where \(\kappa =\omega (\log \lambda )\), be a hash function that will be modeled as a random oracle in the Fiat-Shamir transforms [21].

    Output public parameters

    $$\mathsf {pp}=\{N,\ell ,n,c,m,t_m,t_1,k_1,p,t_2,k_2,v,\mathbf {B},\mathsf {COM},\kappa ,\mathcal {H}_{\mathsf {FS}} \}.$$
  • \(\mathsf {Setup}_{\mathsf {OA}}(\mathsf {pp})\) This algorithm is run by the \(\mathsf {OA}\). Given the input \(\mathsf {pp}\), it triggers the McEliece key generation algorithm \(\mathsf {KeyGen}_{\mathsf {ME}}(n,k_1,t_1) \) (see the full version) twice to obtain encryption key pairs \((\mathbf {G}_{\mathsf {oa},0},\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},0)})\) and \((\mathbf {G}_{\mathsf {oa},1},\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},1)})\). Set \(\mathsf {pk}_{\mathsf {OA}}= (\mathbf {G}_{\mathsf {oa},0}, \mathbf {G}_{\mathsf {oa},1})\) and \(\mathsf {sk}_{\mathsf {OA}}=(\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},0)},\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},1)})\).

  • \(\mathsf {Setup}_{\mathsf {GM}}(\mathsf {pp})\) This algorithm is run by the \(\mathsf {GM}\). It samples \(\mathsf {sk}_{\mathsf {GM}}\xleftarrow {\$}\mathsf {B}(m,t_m)\), then computes \(\mathsf {pk}_{\mathsf {GM}}= \mathbf {B}\cdot \mathsf {sk}_{\mathsf {GM}}\bmod 2\), and outputs \((\mathsf {pk}_{\mathsf {GM}}, \mathsf {sk}_{\mathsf {GM}})\). It also initializes the following.

    • Let the registration table be \(\mathbf {reg}: = (\mathbf {reg}[0],\mathbf {reg}[1], \ldots , \mathbf {reg}[N-1])\), where for each \(i \in [0, N-1]\): \(\mathbf {reg}[i][1] = \mathbf {0}^{n},\mathbf {reg}[i][2] = -1\), and \(\mathbf {reg}[i][3] = -1\). Here, \(\mathbf {reg}[i][1]\) denotes the hash value of the public encryption key of a registered user while \(\mathbf {reg}[i][2],\mathbf {reg}[i][3]\) represent the epoch at which the user joins and leaves the group, respectively.

    • Construct a Merkle tree \(\mathcal {T}\) on top of \(\mathbf {reg}[0][1], \ldots , \mathbf {reg}[N-1][1]\). (Note that \(\mathcal {T}\) is an all-zero tree at this stage, when a new user joins the group, it will affect the Merkle tree.)

    • Initialize a counter of registered users \(j: = 0\).

    Then, GM outputs its public key \( \mathsf {pk}_{\mathsf {GM}}\) and announces \(\mathbf {reg}\) and the initial group information \(\mathsf {info} = \emptyset \) while keeping \(\mathcal {T}\) and j for himself. We remark that \(\mathbf {reg}\) and \(\mathsf {info}\) are visible to everyone but only editable by a party who knows \(\mathsf {sk}_{\mathsf {GM}}\). In addition, anyone is able to verify the well-formedness of \(\mathbf {reg}\) and \(\mathsf {info}\).

  • \(\langle \mathsf {G}_{\mathcal {R}},\mathsf {Sample}_{\mathcal {R}}\rangle \) The algorithm \(\mathsf {G}_{\mathcal {R}}(1^\lambda ,\mathsf {pp})\) proceeds by sampling parameters tk for the relation \(R_{\mathrm {permit}}\) (8). Let \((\mathsf {pk}_{\mathcal {R}},\mathsf {sk}_{\mathcal {R}})=((p,t,k),\epsilon )\). Given \(\mathsf {pk}_{\mathcal {R}}\), the algorithm \(\mathsf {Sample}_{\mathcal {R}}\) outputs a set of keywords \(S=\{\mathbf {s}_1,\ldots , \mathbf {s}_k \}\), \(\mathbf {w}\in \mathbb {Z}_2^p\) such that \((S,\mathbf {w})\in R_{\mathrm {permit}}\).

  • \(\langle \mathsf {Join}, \mathsf {Issue}\rangle \). This is an interactive protocol securely run between a user and the \(\mathsf {GM}\). If a user requests to join the group at epoch \(\tau \), he will follow steps below.

    1. 1.

      The user first generates its encryption key pair. It runs McEliece key generation \(\mathsf {KeyGen}_{\mathsf {ME}}(n,k_2,t_2)\) twice, obtaining \((\mathbf {G}_{0},\mathsf {sk}_{\mathsf {ME}}^{(0)})\) and \((\mathbf {G}_{1},\mathsf {sk}_{\mathsf {ME}}^{(1)})\). Set encryption key \(\mathsf {pk}'=(\mathbf {G}_{0},\mathbf {G}_{1})\) and secret key \(\mathsf {sk}=(\mathsf {sk}_{\mathsf {ME}}^{(0)},\mathsf {sk}_{\mathsf {ME}}^{(1)})\).

    2. 2.

      It then computes the hash of its encryption key \(\mathsf {pk}'\). For \(b\in \{0, 1\}\), write \(\mathbf {G}_b=[\mathbf {g}_{k_2b}|\cdots |\mathbf {g}_{k_2b+k_2-1}]\). Let \(D=\{\mathbf {g}_{0},\mathbf {g}_{1},\ldots , \mathbf {g}_{2k_2-1}\}\). It then runs the accumulation algorithm \(\mathsf {Accu}_{\mathbf {B}}(D)\) (see the full version) to build a (sub)-Merkle tree based on D and the hash function \(h_{\mathbf {B}}\), obtaining an accumulated hash value \(\mathbf {d}\in \mathbb {Z}_2^n\). We call \(\mathbf {d}\) the hash of \(\mathsf {pk}'\). If there is no ambiguity, we sometimes write \(\mathsf {Accu}_{\mathbf {B}}(\mathsf {pk}')\) instead of \(\mathsf {Accu}_{\mathbf {B}}(D)\).

    3. 3.

      If \(\mathbf {d}=\mathbf {0}^n\), the user repeats Step 1 and 2. Otherwise, he sends \((\mathsf {pk}',\mathbf {d})\) to the \(\mathsf {GM}\).

    Upon receiving the tuple \((\mathsf {pk}',\mathbf {d})\) from the user, the \(\mathsf {GM}\) first computes the ranks \(r_1,r_2\) of \(\mathbf {G}_0,\mathbf {G}_1\), respectively, and \(\mathbf {d}'=\mathsf {Accu}_{\mathbf {B}}(\mathsf {pk}')\). If \(r_1\ne k_2\) or \(r_2\ne k_2\) or \(\mathbf {d}'\ne \mathbf {d}\) or \(\mathbf {d}'=\mathbf {0}^n\), \(\mathsf {GM}\) rejects. Otherwise, the two parties proceed as follows.

    1. 1.

      First, \(\mathsf {GM}\) encrypts two random messages by running the deterministic McEliece encryption algorithm using the key \(\mathsf {pk}'\). It first samples \(\mathbf {m}_0,\mathbf {m}_1\xleftarrow {\$}\mathbb {Z}_2^{k_2}\) and \(\mathbf {e}_0,\mathbf {e}_1\xleftarrow {\$}\mathsf {B}(n,t_2)\), then computes \(\mathbf {y}_0=\mathbf {G}_{0}\cdot \mathbf {m}_0\oplus \mathbf {e}_0,\mathbf {y}_1=\mathbf {G}_{1}\cdot \mathbf {m}_1\oplus \mathbf {e}_1\), and sends \(\mathbf {y}_0,\mathbf {y}_1\) to the user.

    2. 2.

      Upon receiving the ciphertexts, user runs the deterministic McEliece decryption algorithm, obtaining \(\mathbf {m}_0',\mathbf {m}_1'\). The user then sends \(\mathbf {m}_0',\mathbf {m}_1'\) to the \(\mathsf {GM}\).

    3. 3.

      If \(\mathbf {m}_0'\ne \mathbf {m}_0\) or \( \mathbf {m}_1'\ne \mathbf {m}_1\), \(\mathsf {GM}\) rejects. Otherwise \(\mathsf {GM}\) issues an identifier to the user as \(\mathsf {uid} = \mathsf {bin}(j) \in \{0,1\}^\ell \). The user then sets his public key as \(\mathsf {pk}=(\mathsf {pk}',\mathsf {bin}(j))\). From now on, we write \(\mathsf {pk}_j'=(\mathbf {G}_{j,0},\mathbf {G}_{j,1}), \mathsf {sk}_j=(\mathsf {sk}_{\mathsf {ME}}^{(j,0)},\mathsf {sk}_{\mathsf {ME}}^{(j,1)})\) to distinguish keys of different users.

    4. 4.

      \(\mathsf {GM}\) also performs the following updates:

      • Update \(\mathcal {T}\) by running the algorithm \(\mathsf {TUpdate}_{\mathbf {B}}(\mathsf {bin}(j), \mathbf {d})\).

      • Register the user to table \(\mathbf {reg}\) as \(\mathbf {reg}[j][1]:= \mathbf {d}\);   \(\mathbf {reg}[j][2]:= \tau \).

      • Increase the counter \(j: = j+1\).

  • \(\mathsf {GUpdate}(\mathsf {sk}_{\mathsf {GM}},\mathcal {S},\mathsf {info}_{\tau _{\mathrm {current}}},\mathbf {reg})\) This algorithm is run by GM to update the group information while also advancing the epoch to \(\tau _{\mathrm {new}}\). It works as follows.

    1. 1.

      Let the set \(\mathcal {S}\) contain all the identifiers of registered users to be revoked. If \(\mathcal {S} = \emptyset \), then go to Step 2.

      Otherwise, \(\mathcal {S} = \{i_1, \ldots ,i_r\}\), for some \(i_1, \ldots , i_r \in [0,N-1]\). Then, for all \(t \in [r]\), GM runs \(\mathsf {TUpdate}_{\mathbf {B}}(\mathsf {bin}(i_t), \mathbf {0}^{n})\) to update the tree \(\mathcal {T}\). Meanwhile, \(\mathsf {GM}\) updates \(\mathbf {reg}[j][3]=\tau _{\mathrm {new}}\).

    2. 2.

      At this point, each of the zero leaves in the tree \(\mathcal {T}\) corresponds to either a revoked user or a potential user who has not yet registered. In other words, only active users in the new epoch \(\tau _\mathrm{new}\) have non-zero hashes of their encryption keys, denoted by \(\{\mathbf {d}_j\}_j\), accumulated in the root \(\mathbf {u}_{\tau _\mathrm{new}}\) of the updated tree.

      For each j, let \(w^{(j)} \in \{0,1\}^\ell \times (\{0,1\}^{n})^\ell \) be the witness for the fact that \(\mathbf {d}_j\) is accumulated in \(\mathbf {u}_{\tau _\mathrm{new}}\). Then GM publishes the group information of the new epoch as:

      $$ \mathsf {info}_{\tau _\mathrm{new}} = \big (\mathbf {u}_{\tau _\mathrm{new}}, \{w^{(j)}\}_j\big ). $$

    We remark that even though \(\mathsf {info}_{\tau _{\mathrm {new}}}\) can be as large as \(\mathcal {O}( \lambda \cdot 2^\ell \cdot \ell )\), it is not necessary for the sender or verifier to download them all. In deed, the sender when running the \(\mathcal {P}\) algorithm only needs to download the respective witness \(w^{(j)} \) of size \(\mathcal {O}(\lambda \cdot \ell )\) bits. Meanwhile, the verifier who runs the \(\mathcal {V}\) algorithm only needs to download \(\mathbf {u}_{\tau _{\mathrm {new}}}\) of size \(\mathcal {O}(\lambda )\) bits. It is also worth noting that one is able to verify the well-formedness of registration table \(\mathbf {reg}\) from group information \(\mathsf {info}_{\tau _{\mathrm {current}}}\) and \(\mathsf {info}_{\tau _{\mathrm {new}}}\)Footnote 4, and vice versaFootnote 5.

  • \(\mathsf {Enc}(\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },\mathbf {w},\mathsf {pk},L)\) \(\mathsf {pk}_{\mathsf {OA}}=(\mathbf {G}_{\mathsf {oa},0},\mathbf {G}_{\mathsf {oa},1})\), \(\mathsf {pk}=(\mathsf {pk}_j',\mathsf {bin}(j))\) for some \(j\in [0,N-1]\) and let \(L\in \{0,1\}^*\). This algorithm is run by a sender who wishes to send a message \(\mathbf {w}\in \mathbb {Z}_2^p\) such that \((S,\mathbf {w})\in R_{\mathrm {permit}}\) to a chosen user j with encryption key \(\mathsf {pk}_j'\). If user j is an active user at current epoch \(\tau \), the sender downloads the corresponding witness \(w^{(j)}=(\mathsf {bin}(j), (\mathbf {w}_\ell , \cdots , \mathbf {w}_1))\) from \(\mathsf {info}_{\tau }\) and performs the following steps.

    1. 1.

      It first encrypts the message \(\mathbf {w}\) under the encryption key \(\mathsf {pk}_j'\).

      • Parse \(\mathsf {pk}_j'=(\mathbf {G}_{j,0},\mathbf {G}_{j,1})\).

      • Sample randomnesses \(\mathbf {r}_{w,0},\mathbf {r}_{w,1}\xleftarrow {\$} \mathbb {Z}_2^{k_2-p}\) and noises \(\mathbf {e}_{w,0},\mathbf {e}_{w,1}\xleftarrow {\$} \mathsf {B}(n,t_2)\).

      • For \(b\in \{0,1\}\), compute

        $$\begin{aligned} \mathbf {c}_{w,b} = \mathbf {G}_{j,b} \cdot \left( \begin{array}{c} \mathbf {r}_{w,b} \\ \mathbf {w} \\ \end{array} \right) \oplus \mathbf {e}_{w,b} \in \mathbb {Z}_2^{n}. \end{aligned}$$
        (10)

      Let \(\mathbf {c}_w=(\mathbf {c}_{w,0},\mathbf {c}_{w,1})\in \mathbb {Z}_2^n\times \mathbb {Z}_2^n\).

    2. 2.

      Next, it encrypts the user’s identity j under the key \(\mathsf {pk}_{\mathsf {OA}}=(\mathbf {G}_{\mathsf {oa},0},\mathbf {G}_{\mathsf {oa},1})\).

      • Let \(\mathsf {bin}(j) = [j_1|\ldots | j_\ell ]^{\top } \in \{0,1\}^\ell \).

      • Sample randomnesses \(\mathbf {r}_{\mathsf {oa},0},\mathbf {r}_{\mathsf {oa},1} \xleftarrow {\$} \mathbb {Z}_2^{k_1-\ell }\) and noises \(\mathbf {e}_{\mathsf {oa},0},\mathbf {e}_{\mathsf {oa},1} \xleftarrow {\$}\mathsf {B}(n,t_1)\).

      • For \(b\in \{0,1\}\), compute

        $$\begin{aligned} \mathbf {c}_{\mathsf {oa},b}= \mathbf {G}_{\mathsf {oa},b}\cdot \left( \begin{array}{c} \mathbf {r}_{\mathsf {oa},b} \\ \mathsf {bin}(j) \\ \end{array} \right) \oplus \mathbf {e}_{\mathsf {oa},b} \in \mathbb {Z}_2^{n}. \end{aligned}$$
        (11)

      Let \(\mathbf {c}_{\mathsf {oa}}=(\mathbf {c}_{\mathsf {oa},0}, \mathbf {c}_{\mathsf {oa},1})\in \mathbb {Z}_2^{n}\times \mathbb {Z}_2^n\).

    3. 3.

      It then generates a proof showing that \(\mathbf {c}_{w,0},\mathbf {c}_{w,1}\) both encrypt \(\mathbf {w}\) and that \(\mathbf {c}_{\mathsf {oa},0}, \mathbf {c}_{\mathsf {oa},1}\) both encrypt \(\mathsf {bin}(j)\). The proof employs a Stern-like interactive \(\mathsf {ZK}\) protocol on public input \((\mathbf {G}_{\mathsf {oa},0},\mathbf {G}_{\mathsf {oa},1},\mathbf {c}_{w},\mathbf {c}_{\mathsf {oa}},L)\) and secret input \((\mathbf {G}_{j,0},\mathbf {G}_{j,1},\mathbf {w},\mathsf {bin}(j),\mathbf {r}_{w,0},\mathbf {r}_{w,1},\mathbf {e}_{w,0},\mathbf {e}_{w,1}, \mathbf {r}_{\mathsf {oa},0},\mathbf {r}_{\mathsf {oa},1},\mathbf {e}_{\mathsf {oa},0},\mathbf {e}_{\mathsf {oa},1})\), described in detail in the full version. The interactive protocol is repeated \(\kappa \) times to achieve negligible soundness error and made non-interactive via Fiat-Shamir transform [21]. The resulting proof is a triple of form \(\pi _{ct}=(\{\mathsf {CMT}_{ct,i}\}_{i=1}^{\kappa }, \mathsf {Ch}_{ct}, \{\mathsf {RSP}_{ct,i} \}_{i=1}^{\kappa })\) such that

      $$\mathsf {Ch}_{ct}=\mathcal {H}_{\mathsf {FS}} (\{\mathsf {CMT}_{ct,i}\}_{i=1}^{\kappa },\mathbf {G}_{\mathsf {oa},0},\mathbf {G}_{\mathsf {oa},1},\mathbf {c}_{w},\mathbf {c}_{\mathsf {oa}},L).$$

    Output the ciphertext \(\psi = (\mathbf {c}_{w,0}, \mathbf {c}_{w,1}, \mathbf {c}_{\mathsf {oa},0}, \mathbf {c}_{\mathsf {oa},1},\pi _{ct})\) and coins

    $$\begin{aligned} \mathsf {coins}_{\psi }=(\mathbf {r}_{w,0},\mathbf {r}_{w,1},\mathbf {e}_{w,0},\mathbf {e}_{w,1}, \mathbf {r}_{\mathsf {oa},0},\mathbf {r}_{\mathsf {oa},1},\mathbf {e}_{\mathsf {oa},0},\mathbf {e}_{\mathsf {oa},1}). \end{aligned}$$
    (12)
  • \(\mathcal {P}\big (\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },S,\psi ,L,\mathbf {w},\mathsf {pk} ,\mathsf {coins}_{\psi }\big )\) Let \(\mathsf {coins}_{\psi }\) be of the form (12) and \(\psi =(\mathbf {c}_{w,0},\mathbf {c}_{w,1},\mathbf {c}_{\mathsf {oa},0},\mathbf {c}_{\mathsf {oa},1},\pi _{ct})\). This algorithm is implemented by the sender above who has encrypted a message \(\mathbf {w}\) to a user j at time epoch \(\tau \). The sender extracts \(\mathbf {B}\) from \(\mathsf {pp}\). In addition to the witness \(w^{(j)}\), he downloads \(\mathbf {u}_{\tau }\) as well from \(\mathsf {info}_{\tau }\). The goal of the sender is to convince the verifier in zero-knowledge that the following conditions hold.

    1. 1.

      The secret message \(\mathbf {w}\in \mathbb {Z}_2^p\) is such that \((S,\mathbf {w})\in R_{\mathrm {permit}}\).

    2. 2.

      The user encryption key \(\mathsf {pk}_j'\) is correctly hashed to a non-zero value \(\mathbf {d}_j\). In other words, \(\mathsf {Accu}_{\mathbf {B}}(\mathsf {pk}_j')=\mathbf {d}_j\) and \(\mathbf {d}_j\ne \mathbf {0}^n\).

    3. 3.

      The non-zero hash value \(\mathbf {d}_j\) is honestly accumulated to value \(\mathbf {u}_{\tau }\) at epoch \(\tau \), i.e., the equation \(\mathsf {Verify}_{\mathbf {B}}(\mathbf {u}_{\tau },\mathbf {d}_j,w^{(j)})=1\) holds.

    4. 4.

      \((\mathbf {c}_{w,0},\mathbf {c}_{w,1}), (\mathbf {c}_{\mathsf {oa},0},\mathbf {c}_{\mathsf {oa},1})\) are honest encryptions of \(\mathbf {w}\) and \(\mathsf {bin}(j)\), respectively. In other words, for \(b\in \{0,1\}\), Eqs. (10) and (11) hold.

    5. 5.

      The randomnesses \(\mathbf {r}_{w,0},\mathbf {r}_{w,1}, \mathbf {r}_{\mathsf {oa},0},\mathbf {r}_{\mathsf {oa},1}\) are binary vectors while noises \( \mathbf {e}_{w,0},\mathbf {e}_{w,1}\) and \(\mathbf {e}_{\mathsf {oa},0},\mathbf {e}_{\mathsf {oa},1}\) are in the sets \(\mathsf {B}(n,t_2)\) and \(\mathsf {B}(n,t_1)\), respectively.

    The proof employs a Stern-like interactive \(\mathsf {ZK}\) protocol on public input \(\big (\mathbf {B},\mathsf {pk}_{\mathsf {OA}}, \mathbf {u}_{\tau },S,\psi ,L\big )\) and secret input \((\mathbf {w},\mathsf {pk},\mathsf {coins}_{\psi },w^{(j)})\), provided in the full version. To achieve negligible soundness error, the protocol is repeated \(\kappa \) times. Then the Fiat-Shamir heuristic [21] is applied. The resulting proof is a triple \(\pi _{\psi }=(\{\mathsf {CMT}_i\}_{i=1}^{\kappa }, \mathsf {Ch},\{\mathsf {RSP}_i\}_{i=1}^{\kappa })\) where

    $$\begin{aligned} \mathsf {Ch}= \mathcal {H}_{\mathsf {FS}}(\{\mathsf {CMT}_i\}_{i=1}^{\kappa },\mathbf {B},\mathsf {pk}_{\mathsf {OA}}, \mathbf {u}_{\tau },S,\psi ,L)\in \{1,2,3\}^{\kappa }. \end{aligned}$$
  • \(\mathcal {V}\big ((\mathsf {pp},\mathsf {pk}_{\mathsf {GM}},\mathsf {pk}_{\mathsf {OA}}, \mathsf {info}_{\tau },S,\psi ,L),\pi _{\psi }\big )\) This algorithm verifies the legitimacy of the ciphertext label pair \((\psi ,L)\) with respect to epoch \(\tau \) and the set of keywords S by checking the validity of the proof \(\pi _{\psi }\). It proceeds as follows.

    1. 1.

      Download \(\mathbf {u}_{\tau }\) from \(\mathsf {info}_{\tau }\).

    2. 2.

      Parse \(\pi _{\psi }=(\{\mathsf {CMT}_i\}_{i=1}^{\kappa }, \mathsf {Ch},\{\mathsf {RSP}_i\}_{i=1}^{\kappa })\).

    3. 3.

      If \(\mathsf {Ch}=[\mathsf {ch}_1|\cdots |\mathsf {ch}_{\kappa }]^{\top }\ne \mathcal {H}_{\mathsf {FS}}(\{\mathsf {CMT}_i\}_{i=1}^{\kappa },\mathbf {B},\mathsf {pk}_{\mathsf {OA}}, \mathbf {u}_{\tau },S,\psi ,L)\), return 0.

    4. 4.

      For \(i\in [1,\kappa ]\), verify the validity of \(\mathsf {RSP}_i\) with respect to the commitment \(\mathsf {CMT}_i\) and the challenge \(\mathsf {ch}_i\). If any of the verifications does not hold, return 0. Else return 1.

  • \(\mathsf {Dec}(\mathsf {info}_{\tau },\mathsf {sk},\psi ,L)\) This algorithm is run by a user j with secret key \(\mathsf {sk}\). Parse \(\mathsf {sk}=(\mathsf {sk}_{\mathsf {ME}}^{(j,0)}, \mathsf {sk}_{\mathsf {ME}}^{(j,1)})\). It performs the following steps.

    1. 1.

      Parse \(\psi = (\mathbf {c}_{w,0}, \mathbf {c}_{w,1}, \mathbf {c}_{\mathsf {oa},0}, \mathbf {c}_{\mathsf {oa},1},\pi _{ct})\). It verifies the validity of \(\pi _{ct}\) as follows.

      • Let \(\pi _{ct}=(\{\mathsf {CMT}_{ct,i}\}_{i=1}^{\kappa }, \mathsf {Ch}_{ct}, \{\mathsf {RSP}_{ct,i} \}_{i=1}^{\kappa })\).

      • If \(\mathsf {Ch}_{ct}\ne \mathcal {H}_{\mathsf {FS}} (\{\mathsf {CMT}_{ct,i}\}_{i=1}^{\kappa },\mathbf {G}_{\mathsf {oa},0},\mathbf {G}_{\mathsf {oa},1},\mathbf {c}_{w,0},\mathbf {c}_{w,1}, \mathbf {c}_{\mathsf {oa},0},\mathbf {c}_{\mathsf {oa},1}, L)\), return \(\bot \). Otherwise, let \(\mathsf {Ch}_{ct}=[\mathsf {ch}_{ct,1}|\cdots |\mathsf {ch}_{ct,\kappa }]^{\top }\).

      • For \(i\in [1,\kappa ]\), verify the validity of \(\mathsf {RSP}_{ct,i}\) with respect to the commitment \(\mathsf {CMT}_{ct,i}\) and the challenge \(\mathsf {ch}_{ct,i}\). If any of the verifications does not hold, return \(\bot \).

    2. 2.

      If the above step does not return 0, it then runs the McElice decryption algorithm \(\mathsf {Dec}_{\mathsf {ME}}(\mathsf {sk}_{\mathsf {ME}}^{(j,0)},\mathbf {c}_{w,0})\) (see the full version), obtaining \(\mathbf {w}'\).

    3. 3.

      If \((S,\mathbf {w}') \in R_{\mathrm {permit}}\), return \(\mathbf {w}'\). Otherwise, return \(\bot \).

  • \(\mathsf {Open}(\mathsf {info}_{\tau },\mathsf {sk}_{\mathsf {OA}}, \psi ,L)\) This algorithm is run by the \(\mathsf {OA}\) who possesses the key \(\mathsf {sk}_{\mathsf {OA}}= (\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},0)},\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},1)})\). It proceeds as follows.

    1. 1.

      Parse \(\psi = (\mathbf {c}_{w,0}, \mathbf {c}_{w,1}, \mathbf {c}_{\mathsf {oa},0}, \mathbf {c}_{\mathsf {oa},1},\pi _{ct})\). It verifies \(\pi _{ct}\) as in the algorithm \(\mathsf {Dec}\). It \(\pi _{ct}\) is invalid, it returns \(\bot \).

    2. 2.

      Otherwise, it runs the decryption algorithm \(\mathsf {Dec}_{\mathsf {ME}}(\mathsf {sk}_{\mathsf {ME}}^{(\mathsf {oa},0)},\mathbf {c}_{\mathsf {oa},0})\), obtaining \([j_1'|\cdots |j_{\ell }']^{\top }\).

    3. 3.

      If \(\mathsf {info}_{\tau }\) does not include a witness containing the string \([j_1'|\cdots |j_{\ell }']^{\top }\), then return \(\perp \).

    4. 4.

      Let \(j' \in [0,N-1]\) be the integer that has binary representation \([j'_1| \cdots |j'_\ell ]^{\top }\). Output \(j'\).

6.2 Asymptotic Efficiency, Correctness, and Security

Efficiency. We now analyze the efficiency of our construction with respect to the security parameter \(\lambda \).

  • The public key and secret key of \(\mathsf {GM}\) have bit size \(\mathcal {O}(\lambda )\).

  • The public key and secret key of \(\mathsf {OA}\) and each user have bit size \(\mathcal {O}(\lambda ^2)\).

  • At each epoch, the sender who runs the \(\mathcal {P}\) algorithm needs to download data of bit size \(\mathcal {O}(\lambda \cdot \ell )\) while the verifier who runs the \(\mathcal {V}\) algorithm needs to download data of bit size \(\mathcal {O}(\lambda )\).

  • The size of ciphertext \(\psi \) is \(\mathcal {O}(\lambda ^2)\) and size of proof \(\pi _{\psi }\) is \(\omega (\log \lambda )\cdot \mathcal {O}(\lambda ^2+\ell \cdot \lambda )\).

Correctness. The above \(\mathsf {FDGE}\) scheme is correct with all but negligible probability. It relies on the following three facts: (a) the correctness of the underlying McEliece encryption scheme and (b) the perfect completeness of the zero-knowledge argument used in the \(\mathsf {Enc}\) algorithm and (c) the perfect completeness of the zero-knowledge argument used in the \(\mathcal {P}\) algorithm. Therefore, in \(\mathbf {Expt}_{\mathcal {A}}^{\mathsf {corr}}(1^{\lambda })\) defined in Sect. 5, the \(\mathcal {V}\) algorithm will output 1 by fact (c), and the \(\mathsf {Dec}\) and \(\mathsf {Open}\) algorithms will output \(\mathbf {w}'=\mathbf {w}\) and \(\mathsf {pk}'=\mathsf {pk}\), respectively, by fact (a) and (b).

Security. In Theorem 1, we prove the given \(\mathsf {FDGE}\) satisfies the proposed security requirements in Sect. 5.1.

Theorem 1

Assume the zero-knowledge argument used in the \(\mathsf {Enc}\) algorithm is simulation-sound and zero-knowledge, the zero-knowledge argument used in the \(\mathcal {P}\) algorithm is sound and zero-knowledge, the randomized McEliece encryption schemes have pseudorandom ciphertexts, and the hash function \(h_{\mathbf {B}}\) is collision resistant. Then, in the random oracle model, the above \(\mathsf {FDGE}\) scheme satisfies message secrecy, anonymity, and soundness.

Due to space limit, details of the proof are provided in the full version.