**Authors**
Michèle Wigger
Mireille Sarkiss
Mustapha Hamad

**License**
CC-BY-4.0

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 4255 Multi-Hop Network With Multiple Decision Centers Under Expected-Rate Constraints Mustapha Hamad , Member, IEEE, Michèle Wigger , Senior Member, IEEE, and Mireille Sarkiss , Member, IEEE Abstract— We consider a multi-hop distributed hypothesis aim at accurately detecting hazardous events or anomalies testing problem with multiple decision centers (DCs) for testing at the decision centers (DCs) by collecting data about the against independence and where the observations obey some measurements at the various sensors. The different events Markov chain. For this system, we characterize the fundamental type-II error exponents region, i.e., the type-II error exponents can be considered as different hypotheses and are assumed that the various DCs can achieve simultaneously, under expected- to determine the joint probability distribution underlying the rate constraints. Our results show that this fundamental expo- data observed at all the terminals. Our focus will be on binary nents region is boosted compared to the region under maximum- hypothesis testing, i.e., situations with only two possible rate constraints, and that it depends on the permissible type-I events, with one of the two events corresponding to the normal error probabilities. When all DCs have equal permissible type-I error probabilities, the exponents region is rectangular and situation, the so called null hypothesis and the other to an all DCs can simultaneously achieve their optimal type-II error alert situation the so called alternative hypothesis. There are exponents. When the DCs have different permissible type-I error two types of errors to distinguish here: type-I error and type-II probabilities, a tradeoff between the type-II error exponents error. Type-I error corresponds to a false alarm where the deci- at the different DCs arises. New achievability and converse sion center decides on the alternative hypothesis when the true proofs are presented. For the achievability, a new multiplexing and rate-sharing strategy is proposed. The converse proof is hypothesis is the null hypothesis. Type-II error corresponds to based on applying different change of measure arguments in a missed detection where the decision center decides on the parallel and on proving asymptotic Markov chains. For the null hypothesis when the true one is the alternative hypothesis. special cases K ∈ {2, 3}, and for arbitrary K ≥ 2 when Since our interest is in alert systems where a missed detection all permissible type-I error probabilities at the various DCs are is more critical, we aim at maximizing the exponential decay equal, we provide simplified expressions for the exponents region; a similar simplification is conjectured for the general case. of the type-II error probability (called error exponent) while only requiring the type-I error probability to stay below a given Index Terms— Multi-hop, distributed hypothesis testing, error threshold. exponents, expected-rate constraints, variable-length coding. Most of the information theoretic works studied the dis- I. I NTRODUCTION tributed binary hypothesis testing problem with a single sensor that communicates with a single distant DC over a noise-free F UTURE wireless systems are driven by the exponential growth of IoT networks and applications with various requirements in terms of rate, reliability, and energy consump- link with a constraint on the maximum allowed communica- tion rate [3], [4], [5], [6], [7], [8], [9], [10]. These results tion. In applications such as health monitoring, security alert- were also extended to setups with noisy communication links ing or automotive car control, the sensing and decision systems [11], [12], [13], to setups with privacy and secrecy constraints [14], [15], and to more complicated networks with either Manuscript received 5 July 2022; revised 1 December 2022; accepted interactive communication [16], [17], [18], [19], multiple sen- 30 December 2022. Date of publication 23 January 2023; date of current version 16 June 2023. The work of Mustapha Hamad and Michèle Wig- sors [8], [20], [21], multiple decision centers [22], [23], [24], ger was supported by the European Research Council (ERC) under the [25], [26], or both of them [27], [28], [29]. The works most European Union’s Horizon 2020 Research and Innovation Programme under closely related to this paper are [28] and [29] which considered Grant 715111. An earlier version of this paper was presented in part at the IEEE Information Theory Workshop (ITW), 2021 [DOI: a multi-hop setup with K sensors and K DCs. Multi-hop 10.1109/ITW48936.2021.9611470] and in part at the IEEE Globe- setups are motivated by the stringent energy constraints of IoT com Conference, 2021 [DOI: 10.1109/GLOBECOM46510.2021.9685750]. devices requiring short-range communication only between (Corresponding author: Mustapha Hamad.) Mustapha Hamad was with LTCI, Télécom Paris, Institut Polytech- neighbouring sensors. nique de Paris, 91120 Palaiseau, France. He is now with the Huawei Specifically, [28] characterized a set of type-II error expo- Mathematical and Algorithmic Sciences Laboratory, Paris Research Center, nent tuples that are simultaneously achievable at the various 92100 Boulogne-Billancourt, France (e-mail: mustapha.hamad@huawei.com). Michèle Wigger is with LTCI, Télécom Paris, Institut Polytechnique de DCs in a multi-hop network with K sensors and K DCs. For Paris, 91120 Palaiseau, France (e-mail: michele.wigger@telecom-paris.fr). the special case of testing against independence and when Mireille Sarkiss is with SAMOVAR, Télécom SudParis, Institut Poly- the type-I error probabilities at all the DCs are required to technique de Paris, 91120 Palaiseau, France (e-mail: mireille.sarkiss@ telecom-sudparis.eu). vanish asymptotically, this set of exponents coincides with the Communicated by P.-N. Chen, Associate Editor for Shannon Theory and fundamental exponents region, which means that in this special Information Measures. case no other exponent tuples are achievable. Testing against Color versions of one or more figures in this article are available at https://doi.org/10.1109/TIT.2023.3238339. independence refers to a hypothesis test where under the Digital Object Identifier 10.1109/TIT.2023.3238339 alternative hypothesis the observations at the various terminals This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ 4256 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 follow the product of the marginal distributions that they coding and testing scheme when the maximum rates are experience under the null hypothesis. The result in [28] further limited by the chosen rate-tuple. For K = 2 and K = required that the joint distribution of the various observations 3, we explicitly characterize the multiplexing probabilities under the null hypothesis satisfies certain Markov chains from in function of the type-I error probability thresholds at the one relay to the other. Interestingly, in this case, the set of various DCs and we show that one can restrict to only exponent tuples that are simultaneously achievable at the K K +1 subschemes, instead of 2K . We conjecture that a similar decision centers is a K-dimensional hypercube, implying that simplification holds for arbitrary K ≥ 2. no tradeoff between the exponents arises and each DC can Our converse proofs apply several instances of the change achieve the optimal exponent as if it was the only DC in of measure arguments in [34], [35], and [36] in parallel, the system. When K = 2, [29] proved the strong converse where we also restrict to jointly typical source sequences as result that the optimal exponent region does not depend on in [35]. In contrast to the related strong converse proofs in [29] the permissible type-I error probabilities. (The work in [29] and [34] no variational characterizations, or hypercontractivity presents different expressions for the sets of achievable type-II arguments [37] are required to prove our desired results. error exponents depending on whether the sum of the two Instead, we rely on arguments showing that certain Markov admissible type-I error probabilities exceeds 1 or not. It can chains hold in an asymptotic regime of infinite blocklengths. however be shown that the two expressions coincide, as we Notice that our method to circumvent variational characteri- argue in Remark 1.) zations, or hypercontractivity, or blowing-up arguments [38], Above works all focused on maximum-rate constraint where seems to extend also to other converse proofs, see for example the length of any message sent over the communication link the simplified proof of the well-known strong converses for is limited. In this paper, we consider expected-rate constraints lossless and lossy compression with side-information at the as in [10], [30], [31], and [32], where the expected length of decoder [39], [40] presented in [41]. the message sent over the communication link is constrained. We summarize our main contributions for the K ≥ 2-hop Most closely related are the works in [30] and [33] which network with K DCs that test against independence and when showed that under an expected-rate constraint R, the optimal the observations at the terminals obey a specific Markov chain: type-II error exponent for testing against independence in • We provide an exact characterization of the general the single-sensor and single-DC setup coincides with the fundamental exponents region under expected-rate con- optimal type-II error exponent under a maximum-rate con- straints. This result shows rate-boosts on all the links in straint R/(1 − ϵ), for ϵ denoting the permissible type-I error the system, and illustrates a tradeoff between the expo- constraint. In other words, the relaxed expected-rate constraint nents at all DCs with different type-I error thresholds. seems to allow to boost the rate by a factor (1−ϵ)−1 compared • To prove achievability, we propose a new coding scheme to the same setup under a maximum-rate constraint. based on multiplexing and rate-sharing strategy. We show that the same conclusion holds for the K-hop • Converses are proved by several parallel change of mea- network with K decision centers considered in [28] when sure arguments and by showing certain Markov chains in all DCs obey the same type-I error constraint ϵ. In this case, the asymptotic regime of infinite blocklengths. the fundamental exponents region is a K-dimensional hyper- • We prove that our results simplify for the special cases cube where all DCs can simultaneously achieve their optimal of K = 2 or K = 3 hops or when all K DCs have type-II error exponents as if they were the only DC in the same admissible type-I error probabilities ϵ. For equal system, and this exponent coincides with the exponent under a type-I error probabilities, the optimal scheme multiplexes maximum-rate constraint but where the rates of all links in the only two subschemes instead of 2K . For general type-I system are boosted by a factor (1−ϵ)−1 . In contrast, when the error probabilities and when there are only K = 2 or various DCs have different type-I error probability thresholds, K = 3 hops, then at most K + 1 subschemes need to be a tradeoff arises between the type-II error exponents that are multiplexed. Multiplexing probabilities in these special simultaneously achievable at the different DCs. This tradeoff, cases can directly be obtained from the permissible type-I which depends on the type-I error thresholds at the different error probabilities at the various DCs. Similar simplifica- DCs, is the first of its kind and we exactly characterize it tions are conjectured for arbitrary K ≥ 2 and arbitrary for the studied multi-hop setup. We notice hence that under type-I error probabilities. expected-rate constraint a strong converse does not hold, since the optimal type-II error exponents depend on the admissible Paper organization: The remainder of this paper is divided type-I error probabilities. We show this for arbitrary K ≥ 2 into two main parts, one focusing on the two-hop network and arbitrary admissible type-I error probabilities ϵ1 , . . . , ϵK (Sections II–VI) and one considering the general K-hop by fully characterizing the fundamental type-II error exponents network (Sections VII–VIII). For the first part, Section II region. describes the two-hop system model, and Section III presents To prove our achievability results under expected-rate con- the related previous results under maximum-rate constraints. straints, we propose a new multiplexing and rate-sharing Section IV explains and analyses our proposed optimal cod- strategy that generalizes the degenerate multiplexing scheme ing schemes for the setup under expected-rate constraints. in [30]. Specifically, we multiplex different coding schemes of Section V contains our main results, discussion, and numerical different sets of rates on the various links and with different analysis for the two-hop network. In Section VI, we provide probabilities, where each multiplexed subscheme is an optimal our converse proof. For the second part, Section VII introduces HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4257 the system model for K hops, presents the related previous results on maximum-rate constraints. It also describes our new optimal coding scheme and the fundamental exponents region under expected-rate constraints, and simplifications on them. The converse for K-Hops is presented in Section VIII. Notation: We follow the notation in [30]. In particular, we use sans serif font for bit-strings: e.g., m for a deterministic Fig. 1. Cascaded two-hop setup with two decision centers. and M for a random bit-string. We let bin(m) denote the shortest bit-string representation of a positive integer m, and for given pmfs PY0 Y1 and PY2 |Y1 and where PY0 , PY1 , and PY2 for any bit-string m we let len(m) and dec(m) denote its length denote the marginals of the joint pmf PY0 Y1 Y2 := PY0 Y1 PY2 |Y1 . and its corresponding positive integer. The set of all bit-strings In this two-hop setup, the transmitter T0 observes the source (n) is denoted by {0, 1}⋆ . sequence Y0n and sends its bit-string message M1 = ϕ0 (Y0n ) Random variables are typically written with upper case (n) to R1 , where the encoding function is of the form ϕ0 : Y0n → symbols, e.g., A, and realizations in lower case, e.g., a. {0, 1}⋆ and satisfies the expected-rate constraint Sets show in calligraphic symbols, for example A, and An denotes the n-fold Cartesian product of A, for any positive E [len (M1 )] ≤ nR1 . (4) integer n. We further abbreviate the random and deterministic The relay R1 observes the source sequence Y1n and with the n-tuples (A1 , . . . , An ) and (a1 , . . . , an ) by An and an . For message M1 received from T0 , it produces a guess Ĥ1 of the given integer K, we use P(K) to denote the power set of (n) hypothesis H using a decision function g1 : Y1n × {0, 1}⋆ → {1, . . . , K}, i.e., the set of all possible subsets of {1, . . . , K} {0, 1}: including the set {1, . . . , K} itself, but excluding the emptyset (n) Ĥ1 = g1 (Y1n , M1 ) ∈ {0, 1}. (5) (n) ∅. In addition, Tµ (PXY ) denotes the strongly typical set as defined in [42, Definition 2.8]. That means, for any small Relay R1 also computes a bit-string message M2 = (n) (n) (n) positive number µ a pair (xn , y n ) lies in Tµ (PXY ) if, and ϕ1 (Y1n , M1 ) using some encoding function ϕ1 : Y1n × ⋆ ⋆ only if, {0, 1} → {0, 1} that satisfies the expected-rate constraint |{t : (xt , yt )}| E [len (M2 )] ≤ nR2 . (6) − PXY (a, b) ≤ µ, ∀(a, b) ∈ X × Y, (1) n Then it sends M2 to the receiver R2 , which guesses hypothesis H using its observation Y2n and the received message M2 , and for all pairs (a, b) of 0 probability (xt , yt ) ̸= (a, b) for all (n) i.e., using a decision function g2 : Y2n × {0, 1}⋆ → {0, 1}, t ∈ {1, . . . , n}. it produces the guess: Throughout this manuscript, hb (·) denotes the binary (n) entropy function, and D(·∥·) the Kullback-Leibler divergence Ĥ2 = g2 (Y2n , M2 ) ∈ {0, 1}. (7) between two probability mass functions on the same alpha- The goal is to design encoding and decision functions such bet. Entropy, conditional entropy, and mutual information of that their type-I error probabilities random variables are denoted by H(·), H(·|·), and I(·; ·). When the probability mass functions of the involved random α1,n ≜ Pr[Ĥ1 = 1|H = 0] (8) variables are not clear from the context, we add them as a α2,n ≜ Pr[Ĥ2 = 1|H = 0] (9) subscript and write for example HP (·), HP (·|·), and IP (·; ·). We use the symbols lim and lim to denote the limsup and the stay below given thresholds ϵ1 > 0 and ϵ2 > 0 and the type-II liminf of sequences. 1{·} denotes the indicator function and error probabilities for any real value x we write [x]+ for max{0, x}. β1,n ≜ Pr[Ĥ1 = 0|H = 1] (10) Finally, we abbreviate the terms independent and identically β2,n ≜ Pr[Ĥ2 = 0|H = 1] (11) distributed by i.i.d. and probability mass function by pmf. decay to 0 with largest possible exponential decay. II. T HE T WO -H OP S YSTEM M ODEL Definition 1: Fix maximum type-I error probabilities (ϵ1 , ϵ2 ) ∈ [0, 1)2 and nonnegative rates (R1 , R2 ). The Consider the distributed hypothesis testing problem in exponent pair (θ1 , θ2 ) is called (R1 , R2 , ϵ1 , ϵ2 )-achievable if Figure 1 with a transmitter T0 , a relay R1 and a receiver R2 there exists a sequence of encoding and decision functions observing sequences Y0n , Y1n and Y2n respectively, forming the (n) (n) (n) (n) {ϕ0 , ϕ1 , g1 , g2 }n≥1 satisfying ∀i ∈ {1, 2}: Markov chain Y0n → Y1n → Y2n (2) E[len(Mi )] ≤ nRi , (12a) lim αi,n ≤ ϵi , (12b) In the special case of testing against independence, i.e., n→∞ depending on the binary hypothesis H ∈ {0, 1}, the tuple 1 1 lim log ≥ θi . (12c) (Y0n , Y1n , Y2n ) is distributed as: n→∞ n β i,n Definition 2: The closure of the set of all (ϵ1 , ϵ2 )- under H = 0 : (Y0n , Y1n , Y2n ) i.i.d. ∼ PY0 Y1 · PY2 |Y1 ; (3a) achievable exponent pairs (θ1 , θ2 ) is called the fundamental under H = 1 : (Y0n , Y1n , Y2n ) i.i.d. ∼ PY0 · PY1 · PY2 (3b) (ϵ1 , ϵ2 )-exponents region and is denoted E ∗ (R1 , R2 , ϵ1 , ϵ2 ). 4258 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 III. P REVIOUS R ESULTS ON M AXIMUM -R ATE where T0 is not present [3]. In the studied two-hop setup, R2 C ONSTRAINTS FOR T WO H OPS thus accumulates the optimal exponents achieved over the two A. The Setup links. Since the exponents region is a rectangle, each of the two decision centers, R1 and R2 , can simultaneously achieve The multi-hop hypothesis testing setup of Figure 1 and their optimal exponents, no tradeoff occurrs between the two Equations (3) was also considered in [27] and [29], but under exponents. We shall see that this is not always the case under maximum-rate constraints: expected-rate constraints. len(Mi ) ≤ nRi , i ∈ {1, 2}, (13) IV. O PTIMAL T WO -H OP C ODING S CHEME U NDER instead of the expected-rate constraints (12a). The fun- E XPECTED -R ATE C ONSTRAINTS ∗ damental exponents region Emax (R1 , R2 , ϵ1 , ϵ2 ) for this The optimal coding scheme under expected-rate constraints maximum-rate setup is defined analogously to Definition (2), depends on whether ϵ1 = ϵ2 , ϵ1 < ϵ2 , or ϵ1 > ϵ2 . The general but with (12a) replaced by (13). idea of all the three schemes is that the three terminals T0 , In the following subsection, we report the fundamental ∗ R1 , R2 multiplex two or three different subschemes, and the exponents region Emax (R1 , R2 , ϵ1 , ϵ2 ) derived in [29]. choice of which subscheme to use depends on the transmitter T0 ’s observations y0n . To inform all terminals about the choice B. The Exponents Region of the subscheme, T0 adds one or two flag bits to its message, Define the two functions which the relay R1 forwards to the receiver R2 . The main distinguishing feature of the different subschemes η1 (R1 ) := max I (U1 ; Y1 ) (14) PU1 |Y0 : is the choice of the subset of terminals—no terminal, both R1 ≥I(U1 ;Y0 ) terminals, only R1 or only R2 –that exploit the information η2 (R2 ) := max I (U2 ; Y2 ) , (15) in the transmitted messages to produce a guess of hypothesis PU2 |Y1 : R2 ≥I(U2 ;Y1 ) H. The other terminals ignore this communication and simply declare Ĥ = 1. The different subschemes occupy different where the mutual information quantities are calculated with communication rates, and as we shall see in the following respect to the joint pmfs PU1 Y0 Y1 := PU1 |Y0 PY0 Y1 and Section V, the allocation of the rates has to be chosen in func- PU2 Y1 Y2 := PU2 |Y1 PY1 Y2 , respectively. As stated in [3], tion of the desired tradeoff between the exponents θ1 and θ2 . in the above maximization problems it suffices to consider In this section, we formulate the subschemes based on generic auxiliary random variables U1 and U2 over alphabets of sizes hypothesis testing schemes for the two-hop network and the |Y0 | + 1 and |Y1 | + 1. single-hop network with vanishing type-I error probabilities Lemma 1: The functions η1 and η2 are continuous, concave and respecting given rate constraints. Replacing these generic and monotonically non-decreasing on their entire domain R+ 0. schemes by the optimal schemes under maximum-rate con- Proof: Appendix A proves the desired properties for η1 . straints [4], [27] attains the optimal error exponents presented The proof for η2 is analogous and omitted. in Theorem 2 ahead. The exponents region of the two-hop setup under maximum- rate constraints was determined in [29] for the case ϵ1 + ϵ2 ̸= A. The Case ϵ1 = ϵ2 = ϵ 1 and in [41] for the case ϵ1 + ϵ2 = 1. Achievability was first established in [27]. We combine two subschemes, where in one subscheme both Theorem 1 ( [29], [41]): Fix (ϵ1 , ϵ2 ) ∈ [0, 1)2 . The fun- R1 and R2 attempt to correctly guess the hypothesis H and damental exponents region under the maximum-rate con- in the other subscheme both simply declare Ĥ = 1. To this straints (13) is: end, we partition the set Y0n into subsets D∅ , D{1,2} ⊆ Y0n so ∗ that under PYn0 the probability of subset D{1,2} is as large as Emax (R1 , R2 , ϵ1 , ϵ2 ) possible but satisfies = {(θ1 , θ2 ) : θ1 ≤ η1 (R1 ) , θ2 ≤ η1 (R1 )+η2 (R2 )}. (16) Pr Y0n ∈ D{1,2} ≤ 1 − ϵ. (17) We notice that the fundamental exponents region does not Notice that as n → ∞ the inequality turns into an equality. depend on the permissible type-I error probabilities ϵ1 and ∗ Depending on whether Y0n lies in D∅ or D{1,2} , the three ϵ2 . We will therefore abbreviate Emax (R1 , R2 , ϵ1 , ϵ2 ) by ∗ terminals follow a different subscheme. Emax (R1 , R2 ). If Y0n ∈ D∅ : In this case, none of the terminals attempts to Remark 1: When ϵ1 +ϵ2 > 1, the work in [29] characterizes ∗ correctly guess the hypothesis H. Specifically, T0 and R1 both Emax (R1 , R2 ) in form of an optimization problem over three send auxiliary random variables U1 , U2 , V , see [29, Eq. (33)]. M1 = M2 = [0] (18) It can however be verified that without loss of optimality this optimization can be restricted to auxiliaries U1 = U2 , and R1 and R2 simply declare in which case the characterisation in [29, Eq. (33)] reduces to Ĥ1 = Ĥ2 = 1. (19) the expression in (16). Notice that η1 (R1 ) determines the optimal exponent in a If Y0n ∈ D{1,2} : In this case, both R1 and R2 attempt to point-to-point system where R2 is not present, and η2 (R2 ) correctly guess H based on the transmitted messages. Specifi- determines the optimal exponent in a point-to-point system cally, T0 , R1 , R2 all apply the encoding/decision functions of a HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4259 given two-hop hypothesis testing scheme with vanishing type- forwards this flag at the beginning of its message M2 to I error probabilities and respecting maximum-rate constraints inform R2 . R{1,2},1 and R{1,2},2 on the two links,1 where these rates are If Y0n ∈ D∅ : T0 and R1 send only the flag-bits chosen to satisfy M1 = M2 = [0, 0] (23) (1 − ϵ)R{1,2},1 ≤ R1 (20a) and R1 and R2 decide on (1 − ϵ)R{1,2},2 ≤ R2 . (20b) To inform all the terminals about the event Y0 ∈ D{1,2} Ĥ1 = Ĥ2 = 1. (24) and consequently about the employed scheme, T0 and R1 If Y0n ∈ D{1} : T0 and R1 apply a given single-hop hypothe- append the [1]-flag at the beginning of their messages sis testing scheme with vanishing type-I error probability and M1 and M2 . expected-rate constraint R{1},1 for message M1 . Moreover, Analysis: By (17) and (20), and because transmission of message M1 is preceded by flag-bits [1, 0], and the relay R1 single bits hardly changes the communication rate for large forwards these flag-bits to R2 : blocklengths, the overall scheme satisfies the expected-rate constraints R1 and R2 on the two links. Appendix B proves M2 = [1, 0]. (25) that when the optimal two-hop hypothesis testing scheme with vanishing type-I error probability [27] is employed for Upon reception of these flag-bits, R2 declares Y0n ∈ D{1,2} , then the overall scheme meets the permissible Ĥ2 = 1. (26) type-I error probability ϵ and achieves the error exponent given by Equation (31) of Theorem 2. We observe that, as indicated by the subscript {1} of set D{1} , only terminal R1 attempts to correctly guess H. Receiver B. The Case ϵ1 < ϵ2 R2 produces the trivial guess in (26) because of its higher admissible type-I error probability ϵ2 > ϵ1 . Notice also that We combine three subschemes, where in each subscheme no communication rate is required for message M2 in the limit either no terminal, only R1 , or both R1 and R2 attempt to as n → ∞. correctly guess H. To this end, we partition the set Y0n If Y0n ∈ D{1,2} : T0 , R1 , R2 apply a given two-hop into three disjoint subsets D∅ , D{1} , D{1,2} ⊆ Y0n so that hypothesis testing scheme with vanishing type-I error probabil- under PY0 the two sets D{1} and D{1,2} have largest possible ities and satisfying the expected-rate constraints R{1,2},1 and probabilities but limited by R{1,2},2 . Pr Y0n ∈ D{1} ≤ ϵ2 − ϵ1 Analysis: By (21) and (22), and because transmission of two (21a) n bits hardly changes the rate for sufficiently large blocklengths, Pr Y0 ∈ D{1,2} ≤ 1 − ϵ2 . (21b) the proposed overall scheme respects the expected-rate con- As a consequence, straints R1 and R2 for large values of n. Appendix C proves that when the optimal single-hop and two-hop hypothesis Pr [Y0n ∈ D∅ ] ≥ ϵ1 . (21c) testing schemes under maximum-rate constraints R{1},1 and (R{1,2},1 , R{1,2},2 ) with vanishing type-I error probability Notice that as n → ∞, the three inequalities (21) can hold [4], [27] are used, then the overall scheme satisfies the type-I with equality. error constraints ϵ1 and ϵ2 and achieves the error exponents Choose also nonnegative rates R{1},1 , R{1,2},1 , R{1,2},2 in Equation (32) of Theorem 2. satisfying (ϵ2 − ϵ1 )R{1},1 + (1 − ϵ2 )R{1,2},1 ≤ R1 (22a) C. The Case ϵ1 > ϵ2 (1 − ϵ2 )R{1,2},2 ≤ R2 . (22b) We combine three subschemes, where in each subscheme Depending on whether Y0n lies in D∅ , D{1} , or D{1,2} , either no terminal, only R2 , or both R1 and R2 attempt to the three terminals apply a different subscheme satisfying a correctly guess H. To this end, we partition the set Y0n different pair of maximum-rate constraints, where the sub- into three disjoint subsets D∅ , D{2} , D{1,2} ⊆ Y0n so that script I of set DI indicates the set of relays that attempt under PY0n the two sets D{2} and D{1,2} have largest possible to correctly guess H in the event Y0n ∈ DI . To commu- probabilities but limited by nicate which of the three subschemes is used, T0 adds a Pr Y0n ∈ D{2} ≤ ϵ1 − ϵ2 two-bit flag at the beginning of its message M1 to R1 , which (27a) n Pr Y0 ∈ D{1,2} ≤ 1 − ϵ1 . (27b) 1 As it will become clear in the subsequent analysis, for the overall scheme to respect rate constraints (4) and (6), it suffices that the two-hop scheme As a consequence, respects the rate constraints R{1,2},1 and R{1,2},2 on expectation. However, as a consequence of our main result in Theorem 2, under vanishing type-I Pr [Y0n ∈ D∅ ] ≥ ϵ2 . (27c) error probabilities, the same type-II error exponents are achievable under both expected- and maximum-rate constraints. There is thus no benefit in considering schemes with expected rates R{1,2},1 and R{1,2},2 , but possibly Notice that as n → ∞, the three inequalities (27) hold with larger maximum rates. equality. 4260 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 Choose also nonnegative rates R{2},1 , R{1,2},1 , R{2},2 , and Theorem 2: Given ϵ1 , ϵ2 , R1 , R2 ≥ 0. R{1,2},2 satisfying If ϵ1 = ϵ2 = ϵ, then E ∗ (R1 , R2 , ϵ, ϵ) is the set of all nonnegative (θ1 , θ2 ) pairs satisfying (ϵ1 − ϵ2 )R{1},1 + (1 − ϵ1 )R{1,2},1 ≤ R1 (28) (ϵ1 − ϵ2 )R{2},2 + (1 − ϵ1 )R{1,2},2 ≤ R2 . (29) θ1 ≤ η1 (R1 /(1 − ϵ)) (31a) θ2 ≤ η1 (R1 /(1 − ϵ)) + η2 (R2 /(1 − ϵ)). (31b) Depending on whether Y0n lies in D∅ , D{2} , or D{1,2} , the three terminals apply a different subscheme. The subscript I If ϵ1 < ϵ2 , then E ∗ (R1 , R2 , ϵ1 , ϵ2 ) is the set of all nonneg- of set DI again indicates the set of terminals that attempt ative (θ1 , θ2 ) pairs satisfying to correctly guess H in the event Y0n ∈ DI , and RI,1 , RI,2 θ1 ≤ min η1 R{1},1 , η1 R{1,2},1 (32a) indicate the maximum rates of the subscheme employed under Y0n ∈ DI . (An exception is the event Y0n ∈ D∅ , where θ2 ≤ η1 R{1,2},1 + η2 (R2 /(1 − ϵ2 )) , (32b) both rates are 0.) Flag-bits are used at the beginning of the for some rates R{1},1 , R{1,2},1 ≥ 0 so that messages M1 and M2 to inform R1 and R2 about which of the subschemes is employed. R1 ≥ (ϵ2 − ϵ1 )R{1},1 + (1 − ϵ2 )R{1,2},1 . (32c) If Y0n ∈ D∅ : All three terminals, T0 , R1 , and R2 apply the If ϵ1 > ϵ2 , then E ∗ (R1 , R2 , ϵ1 , ϵ2 ) is the set of all nonneg- degenerate scheme in (23)–(24). ative (θ1 , θ2 ) pairs satisfying If Y0n ∈ D{2} : As indicated by the subscript of set D{2} , only R2 makes a serious attempt to correctly guess H, while θ1 ≤ η1 (R{1,2},1 ) (33a) R1 always declares θ2 ≤ min η1 (R{1,2},1 ) + η2 R{1,2},2 , Ĥ1 = 1, (30) η1 R{2},1 + η2 R{2},2 (33b) irrespective of the received message and its observations. for some rates R{1,2},1 , R{2},1 , R{1,2},2 , R{2},2 ≥ 0, so that This implies that under this subscheme, α1,n = 1 and β1,n = 0. Besides this decision, T0 , R1 , and R2 apply a given R1 ≥ (ϵ1 − ϵ2 )R{2},1 + (1 − ϵ1 )R{1,2},1 (33c) two-hop distributed hypothesis testing scheme with vanishing R2 ≥ (ϵ1 − ϵ2 )R{2},2 + (1 − ϵ1 )R{1,2},2 . (33d) type-I error probabilities and respecting the maximum-rate constraints R{2},1 and R{2},2 for messages M1 and M2 . Proof: Achievability is based on the schemes in Moreover, both T0 and R1 append the two-bit flag [0,1] at the Section IV, see Appendices B and C for their analyses. The beginning of these two messages to inform all the terminals converse is proved in Section VI. about the employed scheme. Remark 2 (Discussion for ϵ1 = ϵ2 ): By above theorem, for Notice that in the optimal two-hop hypothesis testing ϵ1 = ϵ2 = ϵ, the fundamental exponents region scheme [27], the relay R1 computes a tentative decision E ∗ (R1 , R2 , ϵ, ϵ) is a rectangle. Also, compared to the fun- based on M1 and Y1n , which influences the message M2 damental exponents region under maximum-rate constraints, sent to R2 and allows the latter to improve its type-I error here the rates are boosted by a factor (1 − ϵ)−1 : probability. Here we propose that R1 itself ignores its tentative ∗ ∗ R1 R2 E (R1 , R2 , ϵ, ϵ) = Emax , . (34) decision, because the naive decision (30) is sufficient to satisfy (1 − ϵ) (1 − ϵ) the constraint ϵ1 on its type-I error probability and is also In particular, for ϵ1 = ϵ2 = 0 the fundamental expo- the most-favorable decision to maximize the type-II error nents regions under maximum- and expected-rate constraints exponent. coincide: If Y0n ∈ D{1,2} : Both decision centers R1 and R2 attempt E ∗ (R1 , R2 , 0, 0) = Emax ∗ (R1 , R2 ). (35) to correctly guess H. Specifically, T0 , R1 , and R2 apply a given two-hop hypothesis testing scheme with vanishing Remark 3 (Discussion for ϵ1 < ϵ2 ): For ϵ1 ̸= ϵ2 , the fun- type-I error probabilities and respecting the maximum-rate damental exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ) is not a rect- constraints R{1,2},1 and R{1,2},2 for messages M1 and M2 . angle, as can be verified by the numerical results in Moreover, both T0 and R1 append the two-bit flag [1,1] at the Figures 2, 3, and 4 in the next subsection. In fact, one beginning of these two messages to inform all the terminals observes a tradeoff between the two exponents θ1 and θ2 , about the employed scheme. which is driven by the choice of the rates RI,1 , RI,2 for Analysis: Similarly to the case ϵ1 < ϵ2 , it can be shown that I ∈ P(2), where P(2) is the power set of all subsets of {1, 2} the described scheme respects the expected-rate constraints (4) excluding the emptyset, i.e. P(2) = {{1}, {2}, {1, 2}}. More and (6) on both links, and that when the optimal two-hop specifically, for ϵ1 < ϵ2 the choice scheme [27] is employed, then the described scheme achieves R{1,2},1 = R1 /(1 − ϵ2 ) (36a) the error exponents in Equation (33) of Theorem 2. R{1},1 = 0 (36b) V. E XPONENTS R EGION FOR THE T WO -H OP N ETWORK maximizes exponent θ2 , which then evaluates to U NDER E XPECTED -R ATE C ONSTRAINTS θ2 = θ2,max := η1 (R1 /(1 − ϵ2 )) + η2 (R2 /(1 − ϵ2 )) , (37) The fundamental exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ) has a different form, depending on the three cases ϵ1 = ϵ2 , ϵ1 < ϵ2 , but completely degrades θ1 to θ1 = 0. (Notice that for large or ϵ1 > ϵ2 . R1 /(1 − ϵ2 ) above choice (36) might not be the unique HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4261 optimizer and other optimizers will still allow to attain a the two exponents θ1 and θ2 . In this view, notice that when the positive η1 .) focus is on maximizing θ2 , then for ϵ1 < ϵ2 one has to entirely On the other hand, the choice sacrifice θ1 , whereas for ϵ1 > ϵ2 positive θ1 -exponents are possible but the rate-boost experienced by θ1 is reduced from R{1},1 = R{1,2},1 = R1 /(1 − ϵ1 ) (38) (1 − ϵ1 )−1 , which is the boost experienced for its maximum maximizes exponent θ1 , which then evaluates to θ1,max , to the smaller factor (1 − ϵ2 )−1 . θ1 = θ1,max := η1 (R1 /(1 − ϵ1 )) , (39) A. Numerical Simulations but it degrades θ2 to In this section, we illustrate the benefits of exploiting the θ2 = θ2,deg := η1 (R1 /(1 − ϵ1 ))+η2 (R2 /(1 − ϵ2 )) < θ2,max . relaxed expected-rate constraints in (4) and (6) compared to the more stringent maximum-rate constraints (13) at hand (40) of some examples. We also show for ϵ1 < ϵ2 the benefits Varying the rate R{1,2},1 between the choices in (36) and (38), of “Rate-sharing” on the first link and the corresponding (and accordingly varying also rate R{1},1 to meet (32c)) tradeoff, where the rate R1 is split into (ϵ2 − ϵ1 )R{1},1 achieves the entire Pareto-optimal boundary of the fundamen- and (1 − ϵ2 )R{1,2},1 as in (32), instead of restricting to a tal exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ). single rate choice for the communication on the first link Remark 4 (Discussion for ϵ1 > ϵ2 ): For ϵ1 > ϵ2 the choice R{1,2},1 = R1 /(1 − ϵ1 ). For the case ϵ1 < ϵ2 , “Rate-sharing” on the second link does not have any added value. However, for the case ϵ1 > ϵ2 , we illustrate the benefits of “Rate- R{1,2},1 = R1 /(1 − ϵ1 ) (41a) sharing” on both links and the resulting tradeoff from varying R{2},1 = 0 (41b) the choices of the rates R{1,2},1 , R{2},1 , R{1,2},2 and R{2},2 that satisfy (33). This tradeoff stems from multiplexing three maximizes exponent θ1 , which then evaluates to coding subschemes among which we have two full versions θ1 = θ1,max (42) of the basic two-hop scheme and one degraded subscheme as explained in Subsection IV-C. and degrades θ2 to Throughout this section we consider the following example. ′ θ2 = θ2,deg := min η1 (R1 /(1−ϵ1 ))+η2 (R{1,2},2 ), Example 1: Let Y0 , S, T be independent Bernoulli random η2 (R{2},2 ) , (43) variables of parameters pY0 = 0.4, pS = 0.8, pT = 0.8 and set Y1 = Y0 ⊕ T and Y2 = Y1 ⊕ S. for R{2},2 and R{1,2},2 satisfying (33d). (Notice again that for We first consider the case ϵ1 = 0.05 < ϵ2 = 0.15, large values of R1 /(1 − ϵ1 ) the optimizer in (41) might not and plot the optimal exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ) in be unique and other optimizers might lead to a larger value of Figure 2 for symmetric rates R1 = R2 = 0.5. We note a θ2 .) tradeoff between the type-II error exponents θ1 and θ2 , which On the other hand, the choice is not present neither for the case ϵ1 = ϵ2 , nor for the same setup under maximum-rate constraints. (This tradeoff occurs R{2},1 = R{1,2},1 = R1 /(1 − ϵ2 ) (44a) because both exponents have to be optimized over the same R{2},2 = R{1,2},2 = R2 /(1 − ϵ2 ) (44b) choices of rates R{1},1 , R{1,2},1 .) The figure also shows a maximizes exponent θ2 , which then evaluates to θ2 = θ2,max , sub-optimal version of the exponents region in Theorem 2, but it degrades θ1 to where we set R{1},1 = R{1,2},1 = R1 /(1−ϵ1 ) and thus obtain ∗ Emax (R1 /(1−ϵ1 ), R2 /(1−ϵ2 )). Comparing these two regions, θ1 = θ1,deg := η1 (R1 /(1 − ϵ2 )) (45) we observe that using two different rates R{1},1 and R{1,2},1 (i.e., two different versions of the basic two-hop scheme) Varying the rate R{1,2},1 between the choices in (41) and (44) allows to obtain a better tradeoff between the two exponents. (and varying the rates R{1},1 , R{1,2},2 , R{1},2 accordingly), For futher comparison, Figure 2 also shows the exponents achieves the entire Pareto-optimal boundary of the fundamen- ∗ region Emax (R1 , R2 ) under maximum-rate constraints, so as to tal exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ). illustrate the gain provided by having the relaxed expected-rate Remark 5 (Rate-boosts when ϵ1 ̸= ϵ2 ): Notice that in our constraints instead of maximum-rate constraints. two-hop system with expected-rate constraints, exponents We then consider the case ϵ1 = 0.15 > ϵ2 = 0.05. Here we θ1,max and θ2,max defined in (39) and (37), are the largest consider three sub-cases for the rates: symmetric rates R1 = possible exponents achievable at the two decision centers, R2 = 0.5 or asymmetric rates R1 = 0.75 > R2 = 0.25 or irrespective of the ordering of ϵ1 and ϵ2 . By Theorem 3, R1 = 0.25 > R2 = 0.75. they coincide with the optimal exponents under maximum-rate In Figure 3 we plot the optimal exponents region constraints R1 /(1 − ϵ1 ) and R2 /(1 − ϵ1 ) for the two links in E ∗ (R1 , R2 , ϵ1 , ϵ2 ) in Theorem 2 for the first sub-case R1 = case of (39), and maximum-rate constraints R1 /(1 − ϵ2 ) and R2 = 0.5, and we compare it with the exponents region under R2 /(1 − ϵ2 ) in case of (37). We thus observe that whenever ∗ maximum-rate constraints Emax (R1 , R2 ) and with sub-optimal ϵ1 ̸= ϵ2 , the rate-boosts that expected-rate constraints allow to obtain over maximum-rate constraints depend on the permissi- versions of Theorem 3 where we either set R{1,2},1 = ∗ R1 R2 ble type-I error probabilities and also on the tradeoff between R{2},1 , for which we obtain Emax (1−ϵ2 ) , (1−ϵ2 ) , or we set 4262 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 Fig. 2. Exponents regions for Example 1 when ϵ1 = 0.05 < ϵ2 = 0.15 and Fig. 3. Exponents regions under expected- and maximum-rate constraints R1 = R2 = 0.5. for Example 1 when ϵ1 = 0.15 > ϵ2 = 0.05 and R1 = R2 = 0.5. R{1,2},2 = R{2},2 , for which we have a tradeoff between the type-II error exponents due to rate-sharing on the first link. Comparing all these regions, we see that rate-sharing on the first link allows to obtain a smooth tradeoff between the exponents, while rate-sharing on both links (i.e., having two full versions of the basic two-hop scheme) yields an even improved tradeoff. Figure 4 compares the exponents regions under expected- rate constraints for all three sub-cases. Clearly, θ1 is increasing in R1 , but θ2 is not necessarily increasing in R2 , since it also depends on R1 . In fact, exponents region E ∗ (0.25, 0.75, ϵ1 , ϵ2 ) is completely included in exponents region E ∗ (0.5, 0.5, ϵ1 , ϵ2 ). To understand these phenomena, notice that the maxi- mum achievable exponents on each communication link are Fig. 4. Exponents regions under symmetric and asymmetric expected-rate η1∗ (R1 ) = I(Y0 ; Y1 ) = 0.26766 and η2∗ (R2 ) = I(Y1 ; Y2 ) = constraints for Example 1 and when ϵ1 = 0.15 > ϵ2 = 0.05 and R = 1. 0.27433. Recall also that the θ2 -error exponent is an accu- mulation of the error exponents given by both functions each of the four subsets and thus combined by means of η1 (·) + η2 (·). The similar behaviours of the two functions the law of total expectation to obtain a bound on the total η1 (r) ≈ η2 (r) (r ∈ [0, 1]), together with the concavity and expected message length. The derived upper bounds on the monotonicity of these functions, induce that to obtain the error exponents are not conditioned on the subsets but hold in largest θ2 values in this example, the total rate should be an unconditional sense. The final bound on the type-II error distributed almost equally between both links. In contrast, exponent is then simply obtained by considering the tightest since the θ1 -error exponent depends only on rate R1 , the of the derived bounds, which explains the minimizations in largest value is achieved by putting all available rate to R1 . Theorem 2. All of the above explains the superiority of the error exponent To match the so-obtained bounds on the message lengths region obtained when R1 = R2 = 0.5 over the one obtained and the type-II error exponents with our achievability result when R1 = 0.25, R2 = 0.75, and the tradeoff between the (see the scheme in Section IV) two final steps are needed: exponents regions for the sub-cases R1 = R2 = 0.5 and 1) Show that the introduced auxiliary random variables satisfy R1 = 0.75, R2 = 0.25. certain Markov chains in the asymptotic regime of infinite blocklengths;2 and 2) Show that depending on the ordering VI. C ONVERSE P ROOF TO T HEOREM 2 ϵ1 = ϵ2 , ϵ1 < ϵ2 , or ϵ1 > ϵ2 , there exists an optimal A. Outline of the Converse Proof scheme that under H = 0 assigns asymptotically 0 probability to specific subsets of the introduced partition. For example, The main idea of the proof is to divide the set of strongly (n) if ϵ1 ≤ ϵ2 the probability under H = 0 that R2 decides on typical sequences Tµn (PY0 Y1 Y2 ) into four subsets according H = 0 but R1 does not should vanish, and if ϵ1 ≥ ϵ2 the to the decisions taken at the two DCs, see Figure 5. Parallel probability that R1 decides on H = 0 but R2 does not should change of measure arguments are then applied to each subset vanish. and different lower bounds on the expected message lengths and upper bounds on the error exponents are derived. The 2 In the weak converse proofs that do not have to apply a change of measure (implicit) lower bounds on the expected message lengths are argument these Markov chains are naturally true for all blocklengths because actually lower bounds on the conditional expectations given of the i.i.d.-ness of the source sequences. HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4263 Moreover, if the decision center Rk decides on the null hypothesis Ĥk = 0 for all tuples (y0n , y1n , y2n ) ∈ D, then k 1 X k+1 − log βk,n ≤ I(Uℓ ; Ỹℓ )− log PY0n Y1n Y2n (D). (53) n n ℓ=1 Proof: The lemma is essentially a special case of Lemma 4 ahead, which is proved in Appendix VIII-C. The slightly stronger statement in (51) can easily be verified at hand of the proof in Appendix VIII-C. Remark 6 (Discussion of Lemma 2): Inequalities (50) and (53) provide bounds on the entropy of a message and on the error exponent in case the set D is well chosen. (n) Fig. 5. Sketch of partitioning Tµn (PY0 Y1 Y2 ) and applying parallel change Inequalities (51) and (52) state that when the probability of measure arguments in the converse proof to Theorem 2. PY0n Y1n Y2n (D) does not vanish exponentially fast in n and the set D is a subset of the typical set (in which case the divergence D(PỸ0 Ỹ1 Ỹ2 ∥PY0 Y1 Y2 ) is arbitrarily small), then Technically, the main part of the proof is contained in the for large blocklengths the pmfs PUk Ỹk Ỹk−1 are close to the subsequent Proposition 3, which states the converse bound product pmfs PUk |Ỹk PỸk−1 Ỹk indicating the Markov chain before the mentioned simplification depending on the ordering Ỹk−1 → Ỹk → Uk . Equivalent Markov chains show up in of ϵ1 and ϵ2 , which is proved in Appendix D. The proof the achievability result and thus have to be recovered for the of Proposition 3 applies Lemma 2 ahead to each of the converse to be tight. four subsets introduced in Figure 5 to obtain bounds on the With this lemma, we can prove the desired general outer expected message length and the error exponents, and to prove bound on the exponents region. the desired Markov chains asymptotically. Proposition 3: Given R1 , R2 , ϵ1 , ϵ2 , ≥ 0, the fundamental exponents region E ∗ (R1 , R2 , ϵ1 , ϵ2 ) is included in the set of B. The Converse Proof all (θ1 , θ2 ) pairs satisfying Consider a sequence (in n) of encoding and decision func- θ1 ≤ min{η1 (R{1},1 ), η1 (R{1,2},1 )}, (54a) (n) (n) (n) (n) tions {(ϕ1 , ϕ2 , g1 , g2 )} satisfying the constraints on the θ2 ≤ min η1 (R{1,2},1 ) + η2 (R{1,2},2 ), rates and error probabilities in (12). The converse proof is η1 (R{2},1 ) + η2 (R{2},2 ) , (54b) based on the following lemma. Lemma 2: Consider an arbitrary blocklength n and a subset for rates R{1},1 , R{1,2},1 , R{1,2},2 , R{2},1 , R{2},2 ≥ 0 and D ⊆ Y0n × Y1n × Y2n . Let the tuple (M̃1 , M̃2 , Ỹ0n , Ỹ1n , Ỹ2n ) numbers σ{1} , σ{2} , σ{1,2} ≥ 0 so that σ{1} + σ{2} + σ{1,2} ≤ follow the pmf 1 and PM̃1 M̃2 Ỹ n Ỹ n Ỹ n (m1 , m2 , y0n , y1n , y2n ) ≜ σ{1} + σ{1,2} ≥ 1 − ϵ1 (54c) 0 1 2 1{(y0n , y1n , y2n ) ∈ D} σ{2} + σ{1,2} ≥ 1 − ϵ2 (54d) PY0n Y1n Y2n (y0n , y1n , y2n )· PY0n Y1n Y2n (D) σ{1,2} ≥ max{1 − ϵ1 − ϵ2 , 0}, (54e) ·1 1 (n) (n) n {ϕ1 (y0n ) = m1 }· {ϕ2 (y1 , ϕ1 (y0n )) = m2 }. (46) and so that the following rate constraints are satisfied: Further, define R1 ≥ σ{1} R{1},1 + σ{1,2} R{1,2},1 + σ{2} R{2},1 , (54f) R2 ≥ σ{1,2} R{1,2},2 + σ{2} R{2},2 . (54g) U1 ≜ (M̃1 , Ỹ0T −1 , Ỹ1T −1 , Ỹ2T −1 , T ) (47) U2 ≜ (M̃2 , Ỹ0T −1 , Ỹ1T −1 , Ỹ2T −1 , T ) (48) In Appendix D we show that the fundamental exponents region presented in this proposition is contained in the simpler Ỹk ≜ Ỹk,T , k ∈ {0, 1, 2}, (49) region in Theorem 2, which concludes the proof. (Actually the where T is uniform over {1, . . . , n} and independent of all region in Proposition 3 coincides with the simpler region of previously defined random variables. Theorem 2 but coincides with it. The proof of the reverse The following (in)equalities hold: direction, i.e., inclusion of the region in Theorem 2 in the region of Proposition 3, is however trivial because Proposi- 1 1 tion 3 is a converse result and the region in Theorem 2 is H(M̃k )≥ I(Uk ; Ỹk−1 ) + log PY0n Y1n Y2n (D), n n achievable.) k ∈ {1, 2}, (50) Proof: [Proof of Proposition 3] Choose µn as a sequence 1 satisfying I(U1 ; Ỹ1 |Ỹ0 )≤ − log PY0n Y1n Y2n (D), (51) n 2 lim µn = 0 (55) I(U2 ; Ỹ2 |Ỹ1 )≤ − log PY0n Y1n Y2n (D) + D(PỸ0 Ỹ1 Ỹ2 ∥PY0 Y1 Y2 ). n→∞ n lim µ2n n = ∞ (56) (52) n→∞ 4264 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 and define for each blocklenth n the sets (in)equalities B1,n ≜ {(y0n , y1n , y2n ) ∈ Tµ(n) (PY0 Y1 Y2 ) : H(M̃I,1 ) ≥ nI(UI,1 ; ỸI,0 )+log ∆I,n , (69) n (n) (n) H(M̃I,2 ) ≥ nI(UI,2 ; ỸI,1 )+log ∆I,n , (70) g1 (y1n , ϕ0 (y0n )) = 0}, (57) 1 B2,n ≜ {(y0n , y1n , y2n ) ∈ Tµ(n) (PY0 Y1 Y2 ) : − log ∆I,n ≥ I(UI,1 ; ỸI,1 |ỸI,0 ), (71) n n (n) (n) (n) 2 g2 (y2n , ϕ1 (y1n , ϕ0 (y0n ))) = 0}, (58) − log ∆I,n +D(PỸ0 Ỹ1 Ỹ2 ∥PY0 Y1 Y2 ) D{1,2},n ≜ B1,n ∩ B2,n , (59) n ≥ I(UI,2 ; ỸI,2 |ỸI,1 ) (72) D{1},n ≜ B1,n \D{1,2},n , (60) D{2},n ≜ B2,n \D{1,2},n . (61) and 1 2 − log β1,n ≤ I(UI,1 ; ỸI,1 ) − log ∆I , Define also the set n n if I ∈ {{1}, {1, 2}}, (73) D∅,n = Tµ(n) (PY0 Y1 Y2 )\(D{1},n ∪ D{2},n ∪ D{1,2},n ). (62) 1 3 n − log β2,n ≤ I(UI,1 ; ỸI,1 ) + I(UI,2 ; ỸI,2 ) − log ∆I , n n Notice that the sets D∅,n , D{1},n , D{2},n , D{1,2},n partition the if I ∈ {{2}, {1, 2}}. (74) (n) strongly typical set Tµn (PY0 Y1 Y2 ) and in each set DI,n only Define the following random variables the terminals Rk with k ∈ I declare Ĥk = 0, while terminals Rk with k ∈ / I declare Ĥk = 1. L̃I,j ≜ len(M̃I,j ), j ∈ {1, 2}, I ∈ P(2). (75) Define the probabilities By the rate constraints (4) and (6), and the definition of the random variables L̃I,j , we obtain by the total law of ∆I,n ≜ PY0n Y1n Y2n (DI,n ), I ∈ P(2), (63) expectations: and notice that by the laws of probability nR1 ≥ E[L1 ] (76) X ∆{1,2} + ∆{1} = PY0n Y1n Y2n (B1 ) (64) ≥ E[L̃I,1 ]∆I,n . (77) I∈P(2) ∆{1,2} + ∆{2} = PY0n Y1n Y2n (B2 ) (65) Moreover, ∆{1,2} ≥ PY0n Y1n Y2n (B1 )+PY0n Y1n Y2n (B2 )−1. (66) H(M̃I,1 ) = H(M̃I,1 , L̃I,1 ) (78) X∞ Notice further that by the type-I error probability con- = Pr[L̃I,1 = lI ]H(M̃I,1 |L̃I,1 = lI ) straints (12b) and by [42, Remark to Lemma 2.12] and basic lI =1 laws of probability: + H(L̃I,1 ) (79) ∞ |Y0 ||Y1 ||Y2 | X PY0n Y1n Y2n (Bk ) ≥ 1 − ϵk − , k ∈ {1, 2}, (67) ≤ Pr[L̃I,1 = lI ]lI + H(L̃I,1 ) (80) 4µ2n n lI =1 so that we conclude that: = E[L̃I,1 ] + H(L̃I,1 ), (81) which combined with (69) and (77) and the nonnegativity of lim (∆{1,2},n + ∆{1},n ) ≥ 1−ϵ1 (68a) entropy establishes n→∞ + lim (∆{1,2},n +∆{2},n ) ≥ 1−ϵ2 (68b) X 1 n→∞ ∆I,n I(UI,1 ; ỸI,0 ) + log ∆I,n lim ∆{1,2},n ≥ max{1 − ϵ1 −ϵ2 , 0} (68c) n I∈P(2) n→∞ X X 1 1 lim ∆I,n ≤ 1, (68d) ≤ ∆I,n E[L̃I,1 ]+ ∆I,n H(L̃I,1 ) n→∞ n n I∈P(2) I∈P(2) (82) where (68d) holds (for any blocklength) by the basic laws of probability because the sets DI,n are disjoint. X ∆I,n ≤ R1 1+ hb , (83) To simplify exposition, we assume that for any blocklength nR1 I∈P(2) n the probabilities ∆I,n > 0 for all sets I ∈ P(2). (Other- wise a corresponding subsequence of blocklengths should be where (83) holds by (77) and because the entropy of the considered if it exists. And if no such subsequence exists, meanE[L̃I,1 ] ≤ discrete and positive random variable L̃I,1 of nR1 nR1 ∆I,n the corresponding sets should be discarded. Details of the ∆I,n is bounded by H(L̃I,1 ) ≤ ∆I,n · hb nR1 , which is proofs in these other cases are omitted for brevity.) Applying easily seen by the following two facts: 1) The KL-divergence Lemma 2 to the four subsets allows to conclude that for D(P ∥Q) between an arbitrary pmf P (x) over the positive any I ∈ P(2) there exists a pair (UI,1 , UI,2 ) satisfying the integers of mean s and the geometric distribution of mean s HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4265 equals −HP (X) + HQ (X) and is nonnegative; and 2) The for some nonnegative numbers {σI }I∈P satisfying entropy of a geometric distribution of mean s is HQ (X) = X s · hb ( 1s ). σI ≤ 1 (90e) I∈P In a similar way, we obtain X + σ I ≥ 1 − ϵ1 , (90f) X 1 I∈P : 1∈I ∆I,n I(UI,1 ; ỸI,0 ) + log ∆I,n n σ{1,2} ≥ max{1 − ϵ1 − ϵ2 , 0}, (90g) I∈ {{1,2},{2}} X σ I ≥ 1 − ϵ2 . (90h) I∈P : 2∈I X ∆I ≤ nR2 1 + hb . (84) Notice further that since for any I ∈ P(2) and any R2 I∈ (n ) (n ) (n ) {{1,2},{2}} i the triple (ỸI,0i , ỸI,1i , ỸI,2i ) lies in the jointly typi- (ni ) The proof then follows from (71)–(74) and from (83) cal set Tµni (PY0 Y1 Y2 ) we have |PỸI,0 ỸI,1 ỸI,2 (y0 , y1 , y2 ) − and (84) and by letting n → ∞. Details are as follows. PY0 Y1 Y2 (y0 , y1 , y2 )| ≤ µni and since and µni → 0 as By Carathéodory’s theorem, we can replace above auxiliary i → ∞, the limiting pmfs satisfy PY∗I,0 YI,1 YI,2 = PY0 Y1 Y2 . random variables {(UI,1 , UI,2 )}I , which are of increasing Finally, by continuity considerations, by (89), and by (71) alphabet sizes, by random variables over constant alphabets the following Markov chains must hold for all I ∈ P under UI,1 and UI,2 of sizes PY∗I,0 YI,1 YI,2 UI,1 UI,2 : |UI,1 | ≤ |Y0 | · |Y1 | + 2, I ∈ P(2), (85) UI,1 → YI,0 → YI,1 , (91) |UI,2 | ≤ |UI,1 | · |Y0 | · |Y1 | + 1, I ∈ {{1, 2}, {2}}, (86) UI,2 → YI,1 → YI,2 . (92) while still preserving inequalities (71)–(74), (83), and (84). Notice that at this point and when P ⊊ P(2), one can (n) Let PỸ Ỹ Ỹ U U denote the pmf of these new tuples trivially add dummy random variables (namely UI,j = YI,j−1 I,0 I,1 I,2 I,1 I,2 (ỸI,0 , ỸI,1 , ỸI,2 , UI,1 , UI,2 ) at blocklength n, and abbreviate combined with σI = 0 for all I ∈ P(2)\P) to extend it as PI∗ . We invoke the Bolzano-Weierstrass theorem and conditions (90) to all subsets I ∈ P(2). The desired result consider an increasing subsequence of positive blocklengths in Proposition 3 is then obtained by plugging the definitions {ni }∞ of the functions η1 (·) and η2 (·) into above expressions (90) i=1 such that for all I ∈ P(2) the subsequences of (n ) and by relaxing some constraints (because any element is no probabilities ∆I,ni and pmfs PỸ i Ỹ Ỹ U U converge. I,0 I,1 I,2 I,1 I,2 larger than the maximum). Define the convergence points as: σI ≜ lim ∆I,ni , I ∈ P(2), (87) VII. A S YSTEM W ITH K-H OPS i→∞ (n) We generalize our setup and results to K hops, i.e., to K − 1 PỸ∗I,0 ỸI,1 ỸI,2 UI,1 UI,2 ≜ lim PỸ . (88) i→∞ I,0 ỸI,1 ỸI,2 UI,1 UI,2 relays. Notice next that the right-hand sides of (83) and (84) tend to R1 and R2 , respectively as n → ∞. Moreover, the terms A. System Model 1 n ∆I,n log ∆I,n on the left-hand sides of these inequalities Consider a system with a transmitter T0 observing the vanish as n → ∞ because t 7→ t log t is bounded over the source sequence Y0n , K − 1 relays labeled R1 , . . . , RK−1 interval [0, 1]. Restrict then attention to subsets I ∈ P(2) for and observing sequences Y1n , . . . , YK−1 n , respectively, and a which σI > 0, which implies that receiver RK observing sequence YK . n 1 The source sequences (Y0n , Y1n , . . . , YKn ) are distributed lim log ∆I,n = 0, (89) n→∞ n according to one of two distributions depending on a binary We denote the set of these subsets I satisfying (89) by P ⊆ hypothesis H ∈ {0, 1}: P(2), and also notice that inequalities (83) and (84) remain if H = 0 : (Y0n , Y1n , . . . , YKn ) i.i.d. ∼ PY0 Y1 · PY2 |Y1 · · · valid if summation is only over sets in P because all terms are nonnegative. For subsets in P the last terms in (73) and (74) PYK |YK−1 ; (93a) vanish in the asymptotic regime ni → ∞. if H = 1 : (Y0n , Y1n , . . . , YKn ) i.i.d. ∼ PY0 · PY1 · · · PYK . By all these considerations, by (68), (71)–(74), (83), (93b) and (84), in the limit i → ∞: X Communication takes place over K hops as illustrated R1 ≥ σI · IPI∗ (UI,1 ; YI,0 ) (90a) in Figure 6. The transmitter T0 sends a message M1 = (n) I∈P X ϕ0 (Y0n ) to the first relay R1 , which sends a message (n) R2 ≥ σI · IPI∗ (UI,2 ; YI,1 ) (90b) M2 = ϕ1 (Y1n , M1 ) to the second relay and so on. The I∈P : 2∈I communication is thus described by encoding functions θ1 ≤ min IPI∗ (UI,1 ; YI,1 ) (90c) (n) I∈P ϕ0 : Y0n → {0, 1}⋆ (94) θ2 ≤ ! min IPI∗ (UI,1 ; YI,1 ) + IPI∗ (UI,2 ; YI,2 ) . (n) ϕk : Ykn ×{0, 1}⋆ → {0, 1}⋆ , k ∈ {1, . . . , K −1}, I∈P (90d) (95) 4266 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 maximum-rate setup is defined analogously to Definition 4, but with (98) replaced by (104). Definition 5: For any ℓ ∈ {1, . . . , K}, define the function η ℓ : R+ + 0 → R0 (105) Fig. 6. Cascaded K-hop setup with K decision centers. R 7→ max I (U ; Yℓ ) . (106) PU |Yℓ−1 : R≥I(U ;Yℓ−1 ) so that the produced message strings The functions η1 , . . . , ηK are concave and monotonically (n) M1 = ϕ0 (Y0n ) (96) non-decreasing. The proof is analogous to the proof of (n) Mk+1 = ϕk (Ykn , Mk ), k ∈ {1, . . . , K −1}, (97) Lemma 1 presented in Appendix A, and omitted for brevity. Notice further that in the maximization determining ηℓ (R) it satisfy the expected-rate constraints suffices to consider distributions PU |Yℓ−1 on alphabets of sizes |Yℓ−1 | + 1, see [3]. E [len (Mk )] ≤ nRk , k ∈ {1, . . . , K}. (98) Theorem 4 [41]: Given (ϵ1 , . . . , ϵK ) ∈ [0, 1)K , the fun- Each relay R1 , . . . , RK−1 as well as the receiver RK , damental exponents region under the maximum-rate con- produces a guess of the hypothesis H. These guesses are straints (104) and vanishing type-I error constraints satisfies described by guessing functions ∗ Emax (R1 , . . . , RK , ϵ1 , . . . , ϵK ) (n) gk : Ykn ⋆ × {0, 1} → {0, 1}, k ∈ {1, . . . , K}, (99) ( k ) X = (θ1 , . . . , θK ): θk ≤ ηℓ (Rℓ ), k ∈ {1, . . . , K} (107) where we request that the guesses ℓ=1 (n) Ĥk,n = gk (Ykn , Mk ), k ∈ {1, . . . , K}, (100) Notice that in this K-hop setup, each decision center accu- mulates all the error exponents on the various links from the have type-I error probabilities transmitter to this decision center. The fundamental exponents αk,n ≜ Pr[Ĥk = 1|H = 0], k ∈ {1, . . . , K}, (101) region is thus given by a K-dimensional hyperrectangle. That means, each decision center can simultaneously achieve its not exceeding given thresholds ϵ1 , ϵ2 , . . . , ϵK > 0, and type-II optimal error exponent as if the other decision centers were error probabilities not present in the system. ∗ βk,n ≜ Pr[Ĥk = 0|H = 1], k ∈ {1, . . . , K}, (102) We abbreviate Emax (R1 , . . . , RK , ϵ1 , . . . , ϵK ) by ∗ Emax (R1 , . . . , RK ). decaying to 0 exponentially fast with largest possible expo- nents. Definition 3: Given maximum type-I error probabilities C. Optimal Coding Scheme for K Hops Under ϵ1 , ϵ2 , . . . , ϵK ∈ [0, 1) and rates R1 , R2 , . . . , RK ≥ 0. Expected-Rate Constraints The exponent tuple (θ1 , θ2 , . . . , θK ) is called (ϵ1 , ϵ2 , . . . , ϵK )- Similarly to the two-hop scheme, the terminals multiplex achievable if there (n)exists a sequence of encoding and deci- different subschemes depending on the sequence Y0n observed (n) (n) (n) (n) (n) sion functions ϕ0 , ϕ1 , . . . , ϕK−1 , g1 , g2 , · · · gK n≥1 at the transmitter T0 . To this end, partition the set Y0n into satisfying for each k ∈ {1, . . . , K}: disjoint subsets D∅ and {DI }I∈P(K) so that the probabilities E[len(Mk )] ≤ nRk , (103a) σI := Pr[Y0n ∈ DI ] (108) lim αk,n ≤ ϵk , (103b) satisfy n→∞ 1 1 X X lim log ≥ θk . (103c) 1− ϵk ≤ σI , S ⊆ {1, . . . , K}, (109a) n→∞ n β k,n k∈S I∈P(K) : Definition 4: The fundamental exponents region S⊆I E ∗ (R1 , R2 , . . . , RK , ϵ1 , ϵ2 , . . . , ϵK ) is defined as the closure X σI ≤ 1. (109b) of the set of all (ϵ1 , ϵ2 , . . . , ϵK )-achievable exponent pairs I∈P(K) (θ1 , θ2 , . . . , θK ) for given rates R1 , . . . , RK ≥ 0. In our multiplexed schemes, the index I of DI indicates that if T0 ’s observation Y0n lies in DI , then all decision centers B. Previous Results Under Maximum-Rate Constraints Rk , for k ∈ I, attempt to correctly guess hypothesis H, while The K-hop hypothesis testing setup of Figure 6 and Equa- all decision centers Rk , for k ∈/ I, simply declare Ĥk = 1. tions (93) was also considered in [28], but under maximum- If Y0n ∈ D∅ , then all decision centers R1 , . . . , RK simply rate constraints: declare Ĥ = 1. The transmitter T0 adds K flag-bits to its message M1 to len(Mi ) ≤ nRi , i ∈ {1, . . . , K}, (104) inform R1 about the set DI containing its observation Y0n , instead of the expected-rate constraints (98). The fundamen- and thus about the choice of the employed coding scheme. ∗ tal exponents region Emax (R1 , . . . , RK , ϵ1 , . . . , ϵK ) for this These flag-bits are forwarded by all relays R1 , . . . , RK−1 at HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4267 the beginning of their messages M2 , . . . , MK so as to pass the constraints R1 , . . . , RK on the K links for large values of n. information to all terminals in the network. Appendix E proves that when the optimal multi-hop hypothesis We describe the different multiplexed coding schemes in testing schemes with vanishing type-I error probability [28] more detail. Let ℓ∗I be the largest index in set I: are used as the various subschemes, then the overall scheme satisfies the type-I error constraints ϵ1 , . . . , ϵK and achieves ℓ∗I := max k, (110) k∈I the error exponents in the following Theorem 5. and chooses a set of rates {RI,ℓ : I ∈ P(K), ℓ ∈ {1, . . . , ℓ∗I }} (111) D. Results on the Exponents Region satisfying Theorem 5: The fundamental exponents region E ∗ (R1 , . . . , X RK , ϵ1 , . . . , ϵK ) is equal to the set of all nonnegative tuples Rℓ > σI · RI,ℓ , ℓ ∈ {1, . . . , K}. (112) (θ1 , . . . , θK ) satisfying I∈P(K) : ℓ∗ I ≥ℓ k X θk ≤ min ηℓ (RI,ℓ ), (116a) We will see that the choice of the various rates determines I∈P(K) : k∈I ℓ=1 the tradeoff between the different exponents θ1 , . . . , θK . Rates {RIℓ : ℓ ∈ {1, . . . , ℓ∗I }} are used in the subscheme employed for some nonnegative rates {RI,1 , . . . , RI,ℓ∗I }I∈P(K) and when Y0n ∈ DI , where under this event only the messages nonnegative numbers {σI }I∈P(K) satisfying on the first ℓ∗I links have positive rates, while messages on the last K − Iℓ∗ links are of zero rate. The reason is that X Rk ≥ σI RI,k , k ∈ {1, . . . , K}, decision center RIℓ∗ +1 , . . . , RK simply declare Ĥ = 1 and thus I∈P(K) : messages Mℓ∗I +1 , . . . , MK only have to convey the zero-rate k≤ℓ∗ I information that Y0n ∈ DI . (116b) Subscheme for Y0n ∈ D∅ : All terminals T0 and R1 , . . . , ( X ) X RK−1 send the length-K all-zero bit string over the respective max 0, 1− ϵk ≤ σI , S ⊆ {1, . . . , K}, communication links: k∈S I∈P(K) : S⊆I M1 = · · · = MK = [0, 0, . . . , 0]. (113) (116c) Upon receiving this all-zero flag, relays R1 , . . . , RK−1 and X σI ≤ 1. (116d) receiver RK all declare I∈P(K) Ĥ1 = · · · = ĤK = 1. (114) Proof: Achievability is based on the coding scheme pre- Communication is thus only used to inform the relays and the sented in the previous subsection and analyzed in Appendix E. receiver about the scheme to employ, or equivalently the event The converse is proved in the next Section VIII. Y0n ∈ D∅ , without providing any further information about the We conjecture that similarly to the case of K = 2 hops, correct hypothesis. the optimal exponents region E ∗ (R1 , . . . , RK , ϵ1 , . . . , ϵK ) in Subscheme for Y0n ∈ DI , for I ∈ P(K): In this case, only Theorem 5 can be simplified depending on the ordering of decision centers Rk , for k ∈ I, attempt to correctly guess the ϵ-values. To state our conjecture, we define a permutation hypothesis H; all other decision centers Rk , for k ∈ / I, directly π : {1, . . . , K} → {1, . . . , K} that orders the ϵ-values in declare Ĥk = 1. decreasing order: Terminals T0 , R1 , . . . , Rℓ∗I apply a given ℓ∗I -hop hypothesis testing scheme with vanishing type-I error probabilities and ϵπ(1) ≥ ϵπ(2) ≥ · · · ≥ ϵπ(K) , (117) respecting the maximum-rate constraints RI,1 , . . . , RI,ℓ∗I on and sets ϵπ(0) := 1. We conjecture then that in Theorem 5 the first ℓ∗I links. To inform all relays and the receiver about without loss of optimality one can set the scheme to use, terminals T0 , R1 , . . . ,RK−1 append a K- length flag sequence describing set I at the beginning of their σ{π(i),···,π(K)} = ϵπ(i−1) − ϵπ(i) , i ∈ {1, . . . , K}, (118) messages. We propose that this flag sequence shows bit 1 at all positions k ∈ I and bit 0 at all positions k ∈ / I. Notice that and all other σ-values to 0. Renaming rates R{π(i),···,π(K)},ℓ Messages Mℓ∗I +1 , . . . , MK consist of only the flag sequence. to Ri,ℓ and ℓ∗{π(i),···,π(K)} to ℓ∗i , with the choice in (118) All decision centers Rk with k ∈ I declare the hypoth- Theorem 5 evaluates to the following region: esis indicated by the employed multi-hop hypothesis testing Conjecture 6: The fundamental exponents region scheme. The remaining decision centers Rk with k ∈ / I simply E ∗ (R1 , . . . , RK , ϵ1 , . . . , ϵK ) is the set of all exponent declare tuples (θ1 , . . . , , θK ) that satisfy Ĥk = 1, k ∈ / I. (115) " k # X Analysis: By (108) and (112), and because transmission of θk ≤ min ηℓ (Ri,ℓ ) , k ∈ {1, . . . , K}, i∈{1,...,π(k)} K bits hardly changes the rate for sufficiently large block- ℓ=1 lengths, the proposed overall scheme respects the expected-rate (119a) 4268 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 for some nonnegative rates {Ri,ℓ } satisfying so that for this exponent all rates are boosted by (1 − ϵk )−1 . X In fact, θk = θk,max is achieved by choosing the first k rates Rℓ ≥ ϵπ(i−1) −ϵπ(i) Ri,ℓ , ℓ ∈ {1, . . . , K}, as:3 i∈{1,...,K} : ℓ∗ i ≥ℓ Rℓ RI,ℓ = , k ∈ I, ℓ ∈ {1, . . . , k}. (124) (119b) 1 − ϵk This choice imposes that σI RI,ℓ = 0 for all I not where containing k and all ℓ ∈ {1, . . . , k}. As a consequence, the ℓ∗i := max{ℓ : ℓ ∈ {π(i), . . . , π(K)}}. (119c) optimal performance for a decision center Rk′ , for k ′ < k, is ℓ ′ k The importance of this conjecture lies in proving that the X Rℓ optimal coding scheme in the previous Subsection VII-C only θ k′ = ηℓ , if ϵk′ > ϵk (125) 1 − ϵk ℓ=1 has to multiplex K + 1 subschemes, where the choice of the subschemes depends on the permutation π that orders the ϵ- θk′ = 0, if ϵk′ < ϵk , (126) values. Moreover, no optimization over the probabilities of the where the performance in (125) is obtained by setting σI = various subschemes is required anymore, as the probability of 0 for all I containing an index k ′ < k with ϵ′k > ϵk and each subscheme is determined by the ϵ-values and the ordering by setting the corresponding rates to infinity. Notice that σI permutation π. More specifically, if the conjecture is correct, cannot be chosen equal to 0 for all sets I containing index k ′ < the optimal scheme multiplexes K + 1 schemes, where the k when ϵk′ < ϵk because Constraint (116c) implies that at least i-th scheme is applied with probability ϵπ(i−1) − ϵπ(i) and one of these σ-values is positive, which by σI RI,ℓ = 0 implies serves only the DCs with (K − i + 1)-th smallest type-I error that the corresponding rates RI,ℓ = 0, for all ℓ = 1, . . . , k, constraints, while all other DCs directly declare H = 1. causing θk′ to degrade to 0. We conclude that under (123), for While a general proof of the conjecture is still missing, any k ′ < k, when ϵ′k ≥ ϵk then exponent θk′ is degraded from we can prove it in two special cases. its maximum value because all rates are only boosted by the Proposition 7: Conjecture 6 holds when factor (1 − ϵk )−1 and not by the larger factor (1 − ϵk′ )−1 , and K ∈ {2, 3} (120) when ϵ′k < ϵk the exponent θk′ completely degrades to 0. With appropriate choices for the rates on the last (K − k) or when links, different tradeoffs between the exponents θk+1 , . . . , θK ϵ1 = · · · = ϵK . (121) can be achieved. In particular, it is possible that an exponent θk′ , for k ′ > k, experiences its maximum rate-boost (1 − In particular, for ϵ1 = . . . = ϵK = ϵ, the exponents region ϵk′ )−1 on some of these links. On the first k links, any E ∗ (R1 , . . . , RK , ϵ, . . . , ϵ) is the set of all nonnegative tuples exponent θk+1 , . . . , θK experiences a rate-boost of (1 − ϵk )−1 (θ1 , . . . , θK ) satisfying if the corresponding ϵk′ > ϵk , whereas the contributions of k the first k links completely degrade to 0 if ϵk′ < ϵk . X Rℓ The following lemma indicates that in the evaluation of the θk ≤ ηℓ , k ∈ {1, . . . , K}. (122a) 1−ϵ fundamental exponents region E ∗ (R1 , . . . , RK , ϵ1 , . . . , ϵK ) in ℓ=1 Theorem 5 one can restrict to sets of parameters {σI } that Proof: Achievability of the region in (119) for any value satisfy some of the constraints (116c) with equality and of K follows by specializing the region in Theorem 5 to the set certain σ-values to 0. It is a first step towards proving parameter choice in (118) and setting all other σ-values to 0, Conjecture 6 in the general case or at least towards simplifying and by renaming rates R{π(i),···,π(K)},ℓ as Ri,ℓ . For K = 2 the the evaluation of the exponents region in Theorem 5. result recovers Theorem 2. The converse for K = 3 is proved Lemma 3: Consider a set of nonnegative numbers in Appendix G. The converse for ϵ1 = · · · = ϵK is proved in {RI,1 , . . . , RI,ℓ∗I }I∈P(K) and {σI }I∈P(K) satisfying (116) Appendix H. for exponents (θ1 , . . . , θK ). Let I ′ , I ′′ ∈ P(K) and As we discuss next, similar observations apply for the Γ ∈ [0, σI ′′ ] be so that general case K ≥ 2 as we have presented for K = 2 in Remarks 2–5. I ′ ⊆ I ′′ (127) Remark: [Discussion for ϵ1 = · · · = ϵK ] For equal type-I error probabilities ϵ1 = · · · = ϵK = ϵ, there is a rate-boost and on each link of (1 − ϵ)−1 compared to the scenario with ( ) maximum-rate constraints. X X max 0, 1− ϵk +Γ ≤ σI , S ⊆ I ′′ , S ⊈ I ′ . Remark: [Discussion for the General Case] In the general k∈S I∈P(K) : case, irrespective of the ordering of the permissible type-I error S⊆I probabilities, the largest exponent achievable at a decision (128) center k is given by 3 This choice assumes that the ordering (117) is strict, i.e., no two ϵ- k values coincide. Moreover, when some of the available rates R1 , . . . , Rk are X Rℓ θk,max := ηℓ , (123) sufficiently large so as to saturate the functions ηℓ (Rℓ ), then other choices 1 − ϵk are possible. ℓ=1 HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4269 Then, the new nonnegative numbers changed measure on each set converges to a distribution whose (Y0 , . . . , YK )-marginal equals PY0 ···YK and satisfies desired σ̃I ′ = σI ′ + Γ (129) Markov properties with the auxiliary random variables defined σ̃I ′′ = σI ′′ − Γ (130) by Lemma 4. ′ ′′ σ̃I = σI , I ∈ P(K)\{I , I }, (131) B. Converse Proof and rates, for ℓ ∈ {1, . . . , K}, Fix an exponent-tuple (θ1 , . . . , θK ) in the σI ′ · RI ′ ,ℓ + Γ · RI ′′ ,ℓ R̃I ′ ,ℓ = , (132) exponents region E ∗ (R1 , . . . , RK , ϵ1 , . . . , ϵK ), and a σ̃I ′ sequence (in n) of encoding and decision functions R̃I,ℓ = RI,ℓ , I ∈ P(K)\{I ′ }. (133) (n) (n) (n) (n) (n) {(ϕ0 , ϕ1 , . . . , ϕK , g1 , . . . , gK )}n≥1 achieving this also satisfy (116) for exponents (θ1 , . . . , θK ). tuple, i.e., satisfying constraints (103). Proof: Above rate-definitions essentially only shift the Our proof relies on the following lemma: term Γ · RI ′′ ,ℓ from σI ′′ RI ′′ ,ℓ to σ̃I ′ RI ′ ,ℓ , and therefore the Lemma 4: Fix a blocklength n and a set D ⊆ Y0n × n n rate constraints (116b) remain valid also for the new numbers. Y1 × · · · × YK of positive probability, and let the tuple Similarly, constraint (116d) remains valid since the sum of (M̃1 , M̃2 , . . . , M̃K , Ỹ0n , Ỹ1n , . . . , ỸKn ) follow the pmf all σ-values is preserved. Notice further that the σ-values PM̃1 M̃2 ···M̃K Ỹ n Ỹ n ···Ỹ n (m1 , m2 , . . . , mK , y0n , y1n , . . . , yK n )≜ included in Constraint (116c) for S ⊈ I ′′ remain unchanged 0 1 K by (131) and for S ⊆ I ′ their sum is preserved by (129) 1{(y0n , y1n , . . . , yK n ) ∈ D} PY0n Y1n ···YKn (y0n , y1n , . . . , yK n ) · and (130). For S ⊈ I ′ but S ⊆ I ′′ , Constraint (116c) is PY0n Y1n ...YKn (D) satisfied by Assumption (128). It remains to check the validity ·1{ϕ1 (y0n ) = m1 } · 1{ϕ2 (y1n , ϕ1 (y0n )) = m2 } · · · · of (116a) for the new rate-values. By (133) the constraint ·1{ϕK (yK−1 n n , ϕK−1 (yK−2 , ϕK−2 (· · · , ϕ1 (y0n ))) = mK }. remains unchanged for all k ∈ / I ′ . For k ∈ I ′ , we notice (136) that by (127) the minimum in (116a) includes both sets I ′ and I ′′ and this minimum cannot be smaller for the new rates Further, define the auxiliary random variables because: Uk ≜ (M̃k , Ỹ0T −1 Ỹ1T −1 , . . . , ỸKT −1 , T ), ( k k ) X X k ∈ {1, . . . , K}, (137) min ηℓ (RI ′ ,ℓ ) , ηℓ (RI ′′ ,ℓ ) ℓ=1 ℓ=1 Ỹk ≜ Ỹk,T , k ∈ {0, 1, . . . , K}, (138) ( k X σI ′ Γ where T is uniform over {1, . . . , n} and independent of the ≤ min ηℓ (RI ′ ,ℓ ) + ηℓ (RI ′′ ,ℓ ) , σ̃I ′ σ̃I ′ tuple (M̃1 , M̃2 , . . . , M̃K , Ỹ0n , Ỹ1n , . . . , ỸKn ). ℓ=1 k ) For any k ∈ {1, . . . , K} the following (in)equalities hold: X ηℓ (RI ′′ ,ℓ ) (134) 1 1 H(M̃k ) ≥ I(Uk ; Ỹk−1 ) + log PY0n Y1n ...YKn (D), ℓ=1 n n (139) ( k k ) X X ≤ min ηℓ R̃I ′ ,ℓ , ηℓ R̃I ′′ ,ℓ , (135) 1 ℓ=1 ℓ=1 I(Uk ; Ỹk |Ỹk−1 ) ≤ − log PY0n Y1n ...YKn (D) n where the first inequality holds because the minimum of + D(PỸ0 ···ỸK ∥PY0 ···YK ). (140) two numbers cannot exceed any convex combination of the If further for some k ∈ {1, . . . , K}, decision center Rk numbers, and the second inequality holds by the concavity decides on the null hypothesis for all tuples (y0n , . . . , yK n )∈ and monotonicity of the functions {ηℓ (·)}ℓ . 4 D, VIII. C ONVERSE P ROOF TO T HEOREM 5 Ĥk = 0, (141) A. Proof Outline then The converse proof is similar as for K = 2 users. k X (k + 1) We partition the strongly typical set Tµ (PY0 ···YK ) into 2K βk,n ≤ I(Uℓ ; Ỹℓ ) + log PY0n Y1n ...YKn (D). (142) subsets according to the decisions taken at the various DCs. n ℓ=1 A change of measure argument is then applied in parallel to Proof: See Subsection VIII-C at the end of this section. each of these subsets and based on this change of measure, by applying Lemma 4 ahead, bounds on the conditionally We continue to prove Theorem 5. Let µn be a sequence expected message lengths and on the overall type-II error satisfying exponent are derived, as well as a bound on the proximity of the new measure with some Markov pmf. The desired converse lim µn = 0 (143) n→∞ proof can then be derived by combining the bounds for the lim µ2n n = ∞. (144) different subsets (either through the law of total expectation or n→∞ simply by considering the most stringent one) and by proving 4 Notice that once we fix the realizations of all observed sequences that there exists a subsequence of blocklengths for which the Y0n , . . . , YK n , the decision Ĥ is either determinstically 0 or 1. k 4270 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 For each blocklength n, define for each index k ∈ and, similarly to (84), we obtain with (151a) and the nonneg- {1, . . . , K} the set ativity of entropy: + Bk,n ≜ {(y0n , . . . , yK n ) ∈ Tµ(n) (PY0 ···YK ): Ĥk = 0} (145) X 1 n ∆I,n I(UI,k ; Ỹk−1 ) + log ∆I,n n I∈P(K) : and for each subset I ∈ P(K) the set ℓ∗ I ≥k DI,n ≜ n X ∆I,n (y0n , . . . , yK n ) ∈ Tµ(n) (PY0 ···YK ) : ≤ Rk 1 + hb . (154) n nRk I∈P(K) : Ĥk = 0 ∀k ∈ I and Ĥk = 1 ∀k ∈ / I}. (146) ℓ∗ I ≥k The desired converse can then be concluded based on Notice that the sets {DI,n }I are disjoint and (151b), (151c), and (154), in a similar way as we did for K = 2. Details are omitted. [ DI,n = Bk,n . (147) I∈P(K) : k∈I C. Proof of Lemma 4 Moreover, by [42, Remark to Lemma 2.12] and the type-I error Note first that by (136): probability constraints in (103b), for any k ∈ {1, . . . , K}: D(PỸ n ···Ỹ n ∥PYn0 ···YK ) =− log ∆ (155) 0 K |Y0 | · · · |YK | PY0n Y1n ···YKn (Bk,n ) ≥ 1 − ϵk − . (148) PỸ n (ykn ) ≤PYkn (ykn ), 4µ2n n k k ∈ {1, . . . , K}, ykn ∈ Ykn , (156) Defining where we defined ∆ ≜ PY0n ···YKn (D). Further define Ũk,t ≜ ∆I,n := PY0n Y1n ···YKn (DI,n ), (149) (M̃k , Ỹ0t−1 , . . . , ỸKt−1 ) for k ∈ {1, . . . , K}. we conclude by (147), by standard laws of probability, and the Proof of (139): disjointness of the sets {DI,n }I , that in the limit as n → ∞, Similarly to [34] we introduce the divergence for any subset S ⊆ {1, . . . , K}: D(PỸ n ···Ỹ n ∥PYn0 ···YK ), which is bounded by log ∆ as help in 0 K ( ) the single-letterization of H(M̃k ). For each i ∈ {1, . . . , K}, lim X ∆I,n ≥ max 1 − X ϵk , 0 . (150) we have: n→∞ I∈P(K) : k∈S H(M̃k ) ≥ I(M̃k ; Ỹ0n · · · ỸKn ) S⊆I + D(PỸ n ···Ỹ n ∥PYn0 ···YK ) + log ∆ (157) 0 K We now assume that for every blocklength n and every subset I ∈ P(K) we have ∆I,n > 0. (Otherwise we restrict = H(Ỹ0n · · · ỸKn ) + D(PỸ n ···Ỹ n ∥PYn0 ···YK ) 0 K to an appropriate subsequence of blocklengths or eliminate sets − H(Ỹ0n · · · ỸKn |M̃k ) + log ∆ (158) I ∈ P(K) alltogether.) By Lemma 2 we can then conclude ≥ n[H(Ỹ0,T · · · ỸK,T ) that for any I: for any k ∈ {1, . . . , K} the (in)equalities + D(PỸ0,T ···ỸK,T ∥PY0 ···YK )] 1 1 H(M̃I,k )≥ I(UI,k ; Ỹk−1 ) + log ∆I,n , (151a) n n n X 1 − H(Ỹ0,t · · · ỸK,t |Ũk,t ) + log ∆ (159) I(UI,k ; ỸI,k |ỸI,k−1 )= − log ∆I,n t=1 n + D(PỸ0 ···Ỹ ∥PY0 ···YK ) K (151b) = n[H(Ỹ0,T · · · ỸK,T ) n n + D(PỸ0,T ···ỸK,T ∥PY0 ···YK ) where {UI,1 , . . . , UI,ℓ∗I , M̃I,1 , M̃I,2 , . . . , M̃I,ℓ∗I , ỸI,0 , ỸI,1 , n . . . , ỸI,K } are defined in the lemma. Moreover, for indices −H(Ỹ0,T · · · ỸK,T |Ũk,T , T )] + log ∆ (160) k ∈ I: ≥ n[H(Ỹ0,T · · · ỸK,T ) 1 k X (k + 1) − H(Ỹ0,T · · · ỸK,T |Ũk,T , T )] + log ∆ − log βk,n ≤ I(UI,ℓ ; ỸI,ℓ ) + log ∆I,n . (151c) (161) n ℓ=1 n = n[I(Ỹ0 · · · ỸK ; Uk )] + log ∆n (162) We define the following random variables for I ∈ P(K) 1 and k ∈ {1, . . . , ℓ∗I }: ≥ n I(Ỹk−1 ; Uk ) + log ∆n . (163) n L̃I,k ≜ len(M̃I,k ). (152) Here, (157) holds by (155); (159) holds by the super-additivity property in [34, Proposition 1], by the chain rule, by the defi- By the rate constraints (98) and the total law of expectations: nition of Ũk,t and by defining T uniform over {1, . . . , n} inde- pendent of the previously defined random variables; and (162) X nRk ≥ E[L̃I,k ]∆I,n , (153) I∈P(K) : by the definitions of Uk , Ỹk , Ỹk−1 in the lemma. This proves ℓ∗ I ≥k Inequality (139) in the lemma. HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4271 Proof of (140): Consider an index k satisfying Condition (141), i.e., satisfying We start by noticing the Markov chain M̃1 → Ỹ0n → D ⊆ Ak . Obviously, (Ỹ1n , · · · , ỸKn ), and thus similar to the analysis in [43, PM̃k Ỹ n (Ak ) = 1. (179) Section V.C]: k Define for any k ∈ {1, . . . , K} the pmfs 0 = I(M̃k ; Ỹkn · · · ỸKn |Ỹ0n · · · Ỹk−1 n ) (164) QM̃k (mk ) ≥ H(Ỹkn · · · ỸKn |Ỹ0n · · · Ỹk−1 n ) X ≜ PỸ n (y0n ) · · · PỸ n (yk−1 n ) − H(Ỹkn · · · ỸKn |Ỹ0n · · · Ỹk−1 n M̃k ) y0n ,y1n ,...,yk−1 n 0 k−1 n + D(PỸ n ···Ỹ n ∥PY0 ···YK ) + log ∆ (165) ·1{mk = ϕk (ϕk−1 (· · · (ϕ1 (y0n ) · · · )), yk−1 n )}, (180) 0 K ≥ n[H(Ỹk,T · · · ỸK,T |Ỹ0,T · · · Ỹk−1,T ) and + D(PỸ0,T ···ỸK,T ∥PY0 ···YK )] + log ∆ QMk (mK ) − H(Ỹkn · · · ỸKn |Ỹ0n · · · Ỹk−1 n M̃k ) (166) X ≜ PY0n (y0n ) · · · PYk−1 n n (yk−1 ) ≥ n[H(Ỹk,T · · · ỸK,T |Ỹ0,T · · · Ỹk−1,T ) y0n ,y1n ,...,yk−1 n + D(PỸ0,T ···ỸK,T ∥PY0 ···YK )] + log ∆ ·1{mk = ϕk (ϕk−1 (· · · (ϕ1 (y0n ) · · · )), yk−1 n )}. (181) n and notice that by the definitions of PM̃k Ỹ n , QM̃k and X − H(Ỹk,t · · · ỸK,t |Ỹ0,t · · · Ỹk−1,t k t=1 by (156): Ỹ0t−1 · · · ỸKt−1 Ỹ0,t+1 n n · · · Ỹk−1,t+1 M̃k ) QM̃k PỸ n (Ak ) ≤ QMk PYkn (Ak ) ∆−(k+1) . (182) k (167) By (179) and (182) and the definition of βk,n we have: ≥ nH(Ỹk,T · · · ỸK,T |Ỹ0,T · · · Ỹk−1,T ) + log ∆ 1 −nH(Ỹk,T · · · ỸK,T |Ỹ0,T · · · Ỹk−1,T − log βk,n n Ỹ0T −1 · · · ỸKT −1 Ỹ0,T n n +1 · · · Ỹk−1,T +1 M̃k T ) 1 = − log QMk PYnk (Ak ) (183) (168) n 1 (k+1) ≥ nI(Ỹk,T · · ·ỸK,T ; Ỹ0T −1 · · ·ỸKT −1 M̃k T |Ỹ0,T · · ·Ỹk−1,T ) = − log QM̃k PỸ n (Ak ) − log ∆ (184) n k n + log ∆n (169) 1 (k+1) = D PM̃k Ỹ n (Ak ) ∥ QM̃k PỸ n (Ak ) − log ∆ = nI(Ỹk · · · ỸK ; Uk |Ỹ0 · · · Ỹk−1 ) + log ∆ (170) n k k n ≥ nI(Ỹk ; Uk |Ỹ0 · · · Ỹk−1 ) + log ∆ (171) (185) 1 (k + 1) ≥ nI(Ỹk ; Uk |Ỹk−1 ) − nI(Ỹk ; Ỹ0 · · · Ỹk−2 |Ỹk−1 ) ≤ D(PM̃k Ỹ n ∥QM̃k PỸ n ) − log ∆, (186) n k k n + log ∆. (172) where the inequality holds by the data processing inequality for KL-divergence. where (166) holds by the super-additivity property in We continue to upper bound the divergence term as [34, Proposition 1]; (167) by the chain rule; (168) by the nonnegativity of the Kullback-Leibler divergence. D(PM̃k Ỹ n ∥QM̃k PỸ n ) k k The desired inequality (140) is then obtained by combin- = I(M̃k ; Ỹkn ) + D(PM̃k ∥QM̃k ) (187) ing (172) with the following lower bound: ≤ I(M̃k ; Ỹkn ) + D(PỸ n ∥PỸ n QM̃k−1 ) (188) k−1 M̃k−1 k−1 D(PỸ0 ···ỸK ∥PY0 ···YK ) ≤ I(M̃k ; Ỹkn ) + I(M̃k−1 ; Ỹk−1 n ) ≥ D(PỸ0 ···Ỹk ∥PY0 ···Yk ) (173) + D(PỸ n ∥PỸ n QM̃k−2 ) (189) k−2 M̃k−2 k−2 = D(PỸ0 ···Ỹk ∥PY0 ···Yk−1 PYk |Yk−1 ) (174) .. = D(PỸ0 ···Ỹk−1 ∥PỸ0 ···Ỹk−1 PỸk |Ỹk−1 ) . h i k X + EPỸ D(PỸk |Ỹk−1 ∥PYk |Yk−1 ) ≤ I(M̃i ; Ỹin ) (190) k−1 i=1 + D(PỸ0 ···Ỹk−1 ∥PY0 ···Yk−1 ) (175) Xk Xn ≥ D(PỸ0 ···Ỹk ∥PỸ0 ···Ỹk−1 PỸk |Ỹk−1 ) (176) = I(M̃i ; Ỹi,t |Ỹit−1 ) (191) ≥ I(Ỹ0 · · · Ỹk−2 ; Ỹk |Ỹk−1 ). (177) i=1 t=1 Xk X n ≤ I(M̃i Ỹ0t−1 · · · ỸKt−1 ; Ỹi,t ) (192) Proof of (142): For each k ∈ {1, . . . , K}, define Rk ’s i=1 t=1 acceptance region k X n X = I(Ũi,t ; Ỹi,t ) (193) Ak ≜ {(mk , ykn ) : gk (mk , ykn ) = 0}. (178) i=1 t=1 4272 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 k X each of them. Moreover, we prove the desired Markov chains = nI(Ũi,T ; Ỹi,T |T ) (194) of the auxiliary random variables that arise in the typical i=1 k single-letterization steps, in the asymptotic regimes of infinite ≤n X I(Ui ; Ỹi ). (195) blocklengths. The proof technique of using asymptotic Markov i=1 chains in connection with change of measure arguments can also be used to prove strong converse results of source coding Here (188) is obtained by the data processing inequality for and channel coding theorems, see [41] for first results. KL-divergence; (191) by the chain rule; and (193)–(195) by Interesting future research directions include results for the definitions of Ũi,t , Ui , Ỹi and T . other types of hypothesis testing, not necessarily testing Combined with (183), Inequality (195) establishes Inequal- against independence or not assuming a Markov chain under ity (142) for k ∈ {1, . . . , K}. the null hypothesis. Other network structures are also of practical importance. Intriguing following-up questions exist IX. D ISCUSSION AND O UTLOOK also from an optimization perspective. For example, finding We derived the optimal type-II exponents region under the optimal rate-distribution across the various links so as to expected-rate constraints for the K-hop network with K maximize a weighted sum of the exponents. decision centers (DC) for testing against independence and when the observations at the sensors respect some Markov A PPENDIX A chain. Equivalent simplified expressions were proved for K ∈ P ROOF OF L EMMA 1: C ONCAVITY AND {2, 3} and when all DCs have same admissible type-I error M ONOTONICITY OF THE F UNCTION η1 probabilities. Similar simplifications are conjectured to hold The function η1 (R) is monotonically non-decreasing also in the general case. When the various DCs have different because larger values of R imply larger optimization domains. permissible type-I errors, then the derived exponents region Continuity follows simply by the continuity of mutual illustrates a tradeoff between the error exponents that are information. simultaneously achievable at the various DCs. In general, The concavity of η1 (R) follows by the following arguments. an increase in exponents region is observed compared to the Consider rates R and R̃, and let U ∗ and Ũ ∗ be the corre- setup with maximum-rate constraints. When all DCs have sponding solutions to the optimizations in the definition of equal permissible type-I error probability ϵ, then the exponents η1 . Pick any λ ∈ [0, 1], define Q ∼ Bern(λ) independent of region degenerates to a K-dimensional hypercube meaning (Y0 , Y1 , U ∗ , Ũ ∗ ), and set that all DCs can simultaneously achieve their optimal error ( exponents. This optimal exponent coincides with the optimal ∗ U ∗ if Q = 0 UQ = (196) exponent under maximum-rate constraint where the rates have Ũ ∗ if Q = 1. to be boosted by the factor (1 − ϵ)−1 . ∗ Defining the random variable V := (UQ , Q), we obtain To achieve the optimal tradeoff, a novel coding and test- ing scheme based on multiplexing and rate-sharing is pro- λ · η1 (R) + (1 − λ) · η1 (R̃) posed. The idea is that the transmitter chooses one of 2K = λI(U ∗ ; Y1 ) + (1 − λ)I(Ũ ∗ ; Y1 ) (197) subschemes with appropriate probabilities and applies each ∗ subscheme with a well-chosen rate tuple. Notice that the = I(UQ ; Y1 |Q) (198) ∗ various rate-tuples determine the error exponents achieved at = I(UQ , Q; Y1 ) (199) the various DCs, and thus steer the tradeoff between the error = I(V ; Y1 ) (200) exponents at the different DCs. We multiplex schemes in a way ≤ η1 (I(V ; Y0 )) (201) that each of the subschemes is meant to help only a subset of the DCs in their decision; all other DCs simply raise an alarm ≤ η1 (λR + (1 − λ)R̃) (202) so as not to compromise their type-II error exponents. The where (199) holds because Q is independent of Y1 , (201) probabilities of the various subschemes then have to be chosen holds by the definition of the function η1 , and (202) holds such that the probability of each DC raising an alarm does not by the monotonicity of the function η1 and the following set exceed its permissible type-I error probability. We conjecture of (in)equalities: that it suffices to multiplex only K + 1 subschemes and that ∗ ∗ they should be chosen with probabilities determined by the I(V ; Y0 ) = I(UQ , Q; Y0 ) = I(UQ ; Y0 |Q) (203) ∗ ∗ type-I error probabilities. We managed to prove this conjecture = λI(U ; Y0 ) + (1 − λ)I(Ũ ; Y0 ) (204) for K ∈ {2, 3} and when all DCs have same permissible ≤ λR + (1 − λ)R̃. (205) type-I error probabilities, but proofs for the general case seem cumbersome. ■ Notice that the proposed multiplexing and rate-sharing strategy is also optimal for other multi-terminal hypothesis A PPENDIX B testing setups, as we show in [44]. A NALYSIS OF THE C ODING S CHEME IN Our converse proof methods rely on applying 2K change S UBSECTION IV-A FOR ϵ1 = ϵ2 = ϵ of measure arguments in parallel, and to separately bound Consider the two-hop scheme employed when Y0n ∈ D{1,2} , the achievable error exponents and the required rates for and let Ĥ{1,2},1 and Ĥ{1,2},2 denote the guesses produced at HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4273 R1 and R2 when employing this scheme for any Y0n ∈ Y0n . Notice that for Y0n ∈ D∅ both R1 and R2 decide on Ĥ1 = Notice that by assumption the type-I error probabilities of this Ĥ2 = 1. Applying the total law of probability, we can write scheme tend to 0 as n → ∞: α1,n = Pr[Ĥ1 = 1|H = 0] (218) lim Pr[Ĥ{1,2},k = 1|H = 0] = 0, k ∈ {1, 2}. (206) = Pr[Ĥ1 = 1, Y0n ∈ D∅ |H = 0] n→∞ Noticing that when Y0n ∈ D∅ , then Ĥ1 = Ĥ2 = 1, and + Pr[Ĥ1 = 1, Y0n ∈ D{1} |H = 0] applying the total law of probability, we can write for k ∈ + Pr[Ĥ1 = 1, Y0n ∈ D{1,2} |H = 0] (219) {1, 2}: = Pr[Y0n ∈ D∅ |H = 0] αk,n = Pr[Ĥk = 1|H = 0] (207) + Pr[Ĥ{1},1 = 1, Y0n ∈ D{1} |H = 0] = Pr[Ĥk = 1, Y0n ∈ D∅ |H = 0] + Pr[Ĥ{1,2},1 = 1, Y0n ∈ D{1,2} |H = 0] (220) + Pr[Ĥk = 1, Y0n ∈ D{1,2} |H = 0] (208) ≤ Pr[Y0n ∈ D∅ |H = 0] = Pr[Y0n ∈ D∅ |H = 0] + Pr[Ĥ{1},1 = 1|H = 0] + Pr[Ĥ{1,2},k = 1, Y0n ∈ D{1,2} |H = 0] (209) + Pr[Ĥ{1,2},1 = 1|H = 0] (221) ≤ Pr[Y0n ∈ D∅ |H = 0] Combining this inequality with (217), and because in the limit + Pr[Ĥ{1,2},k = 1|H = 0] (210) n → ∞ Inequality (128) turns into an equality, we conclude that the overall scheme satisfies the type-I error constraint: Combining these inequalities with (206), and because in the limit n → ∞ Inequality (17) turns into an equality, lim α1,n ≤ ϵ1 . (222) n→∞ we conclude that the overall scheme satisfies the type-I error constraints: Similarly we have: lim αk,n ≤ ϵ, k ∈ {1, 2}. (211) α2,n = Pr[Ĥ2 = 1|H = 0] (223) n→∞ = Pr[Ĥ2 = 1, Y0n ∈ (D∅ ∪ D{1} )|H = 0] For the type-II error probabilities of the overall scheme we observe for k ∈ {1, 2}: + Pr[Ĥ2 = 1, Y0n ∈ D{1,2} |H = 0] (224) = Pr[Y0n ∈ (D∅ ∪ D{1} )|H = 0] β1,n = Pr[Ĥk = 0|H = 1] (212) + Pr[Ĥ{1,2},2 = 1, Y0n ∈ D{1,2} |H = 0] (225) = Pr[Ĥk = 0, Y0n ∈ D∅ |H = 1] ≤ Pr[Y0n ∈ (D∅ ∪ D{1} )|H = 0] (226) + Pr[Ĥk = 0, Y0n ∈ D{1,2} |H = 1] (213) + Pr[Ĥ{1,2},2 = 1|H = 0]. (227) = Pr[Ĥk = 0, Y0n ∈ D{1,2} |H = 1] (214) Combining this inequality with (217), and because in the limit = Pr[Ĥ{1,2},k = 0, Y0n ∈ D{1,2} |H = 1] (215) n → ∞ Inequalities (21a) and (128) turn into equalities, ≤ Pr[Ĥ{1,2},k = 0|H = 1]. (216) we conclude that the overall scheme satisfies the type-I error constraint: The type-II error exponents of the overall scheme are thus lim α2,n ≤ ϵ2 . (228) given by the error exponents of the two-hop scheme employed n→∞ under Y0n ∈ D{1,2} . By [27] and because the two-hop scheme For the relay’s type-II error probability in the overall scheme has to have vanishing type-I error probabilities and respect the we observe: rate constraints R{1,2},1 and R{1,2},2 , the exponents in (31) are proved achievable. β1,n = Pr[Ĥ1 = 0|H = 1] (229) = Pr[Ĥ1 = 0, Y0n ∈ D∅ |H = 1] A PPENDIX C + Pr[Ĥ1 = 0, Y0n ∈ D{1} |H = 1] A NALYSIS OF THE C ODING S CHEME IN + Pr[Ĥ1 = 0, Y0n ∈ D{1,2} |H = 1] (230) S UBSECTION IV-B FOR ϵ2 > ϵ1 Consider the two-hop scheme employed when Y0n ∈ D{1,2} , = Pr[Ĥ1 = 0, Y0n ∈ D{1} |H = 1] and let Ĥ{1,2},1 and Ĥ{1,2},2 denote the guesses produced at + Pr[Ĥ1 = 0, Y0n ∈ D{1,2} |H = 1] (231) R1 and R2 when employing this scheme for any y0n ∈ Y0n . = Pr[Ĥ{1},1 = 0, Y0n ∈ D{1} |H = 1] Similarly, let Ĥ{1},1 and Ĥ{1},2 denote the guesses produced + Pr[Ĥ{1,2},1 = 0, Y0n ∈ D{1,2} |H = 1] (232) at R1 and R2 when employing the scheme for Y0n ∈ D{1} , where we again extend the scheme to the entire set Y0 . ≤ Pr[Ĥ{1},1 = 0|H = 1] By assumption, the type-I error probabilities of these + Pr[Ĥ{1,2},1 = 0|H = 1]. (233) schemes tend to 0 as n → ∞: The relay’s type-II error exponent of the overall scheme is thus lim Pr[Ĥ{1},k = 1|H = 0] = 0, k ∈ {1, 2} (217a) given by the minimum of the error exponents of the single-hop n→∞ lim Pr[Ĥ{1,2},k = 1|H = 0] = 0, k ∈ {1, 2}. (217b) scheme employed under Y0n ∈ D{1} and of two-hop scheme n→∞ employed under Y0n ∈ D{1,2} . By [4] and [27] and because 4274 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 these schemes have vanishing type-I error probabilities and where (244) holds again because the minimum is never larger respect the rate constraints R{1},1 and (R{1,2},1 , R{1,2},2 ), than any linear combination; (245) holds by the concavity of respectively, the exponent θ1 in (32) is proved achievable. the functions η1 (·) and η2 (·); and (246) holds because by (54) It remains to analyze the receiver’s type-II error exponent: we have σ{2} R{2},i + σ{1,2} R{1,2},i ≤ Ri , for i ∈ {1, 2}, and σ{2} + σ{1,2} ≥ 1 − ϵ. β2,n = Pr[Ĥ2 = 0|H = 1] (234) This concludes the converse proof to (31). = Pr[Ĥ2 = 0, Y0n ∈ (D∅ ∪ D{1} )|H = 1] 2) The Case ϵ1 < ϵ2 : Choose nonnegative numbers + Pr[Ĥ2 = 0, Y0n ∈ D{1,2} |H = 1] (235) a1 , a1,2 , b1 , b1,2 , c1,2 satisfying = Pr[Ĥ2 = 0, Y0n ∈ D{1,2} |H = 1] (236) a1 + a1,2 ≤ σ{1} (247a) = Pr[Ĥ{1,2},2 = 0, Y0n ∈ D{1,2} |H = 1] (237) b1 + b1,2 ≤ σ{1,2} (247b) ≤ Pr[Ĥ{1,2},2 = 0|H = 1]. (238) c1,2 ≤ σ{2} (247c) a1,2 + b1,2 = b1,2 + c1,2 = 1 − ϵ2 (247d) The receiver’s type-II error exponent of the overall scheme is thus given by the error exponent of the two-hop scheme a1 + b1 = ϵ2 − ϵ1 . (247e) employed under Y0n ∈ D{1,2} . By [27] and because this Notice that this set of (in)equalities is equivalent to the two scheme has vanishing type-I error probabilities and respects equalities a1,2 = c1,2 = 1−ϵ2 −b1,2 and a1 = ϵ2 −ϵ1 −b1 and the rate constraints (R{1,2},1 , R{1,2},2 ), the exponent θ2 the three inequalities: in (32) is proved achievable. 1 − ϵ1 − b1,2 − b1 ≤ σ{1} (248a) A PPENDIX D b1 + b1,2 ≤ σ{1,2} (248b) S IMPLIFICATION OF THE O UTER B OUND IN P ROPOSITION 3 1 − ϵ2 − b1,2 ≤ σ{2} . (248c) We proceed to simplify the outer bound in Proposition 3 Through the Fourier-Motzkin Elimination (FME) Algorithm, depending on the cases ϵ1 = ϵ2 , ϵ1 < ϵ2 , or ϵ1 > ϵ2 . To this it can be verified that above three inequalities (248) have end, fix an exponent pair (θ1 , θ2 ) in E ⋆ (R1 , R2 , ϵ1 , ϵ2 ), rates a nonnegative solution pair (b1 , b1,2 ), with corresponding R{1},1 , R{1,2},1 , R{1,2},2 , R{2},1 , R{2},2 ≥ 0 and numbers nonnegative values for a1,2 , c1,2 , a1 , whenever σ{1} , σ{2} , σ{1,2} ≥ 0 summing to less than 1 and satisfying constraints (54). 0 ≤ σI , I ∈ P(2), (249a) 1) The Case ϵ1 = ϵ2 : By (54): 1 − ϵi ≤ σ{i} + σ{1,2} , i ∈ {1, 2}, (249b) θ1 ≤ min{η1 (R{1},1 ), η1 (R{1,2},1 )} (239) 0 ≤ ϵ2 − ϵ1 , (249c) σ{1} η1 (R{1},1 ) + σ{1,2} η1 (R{1,2},1 ) which hold by assumption, see (54). The existence of the ≤ (240) σ{1} + σ{1,2} desired nonnegative numbers a1 , a1,2 , b1 , b1,2 , c1,2 satisfy- σ{1} R{1},1 + σ{1,2} R{1,2},1 ing (247) is thus established. ≤ η1 (241) σ{1} + σ{1,2} With the chosen numbers, we form ≤ η1 (R1 /(1 − ϵ)) , (242) a1,2 R{1},1 + b1,2 R{1,2},1 R̃{1,2},1 := max , 1 − ϵ2 where (240) holds because the minimum is never larger than b1,2 R{1,2},1 + c1,2 R{2},1 any linear combination; (241) holds by the concavity of the , function η1 (·); and (242) holds by the monotonicity of the 1 − ϵ2 function η1 (·) and because by (54) we have σ{1} R{1},1 + (250a) σ{1,2} R{1,2},1 ≤ R1 and σ{1} + σ{1,2} ≥ 1 − ϵ. b1,2 R{1,2},2 + c1,2 R{2},2 R̃{1,2},2 := , (250b) Following similar steps, one can prove that 1 − ϵ2 n a1 R{1},1 + b1 R{1,2},1 θ2 ≤ min η1 R{1,2},1 + η2 R{1,2},2 , R̃{1},1 := . (250c) ϵ2 − ϵ1 o η1 R{2},1 + η2 R{2},2 (243) We show that exponents (θ1 , θ2 ) and rates R̃{1},1 , R̃{1,2},1 σ{2} η1 R{2},1 + σ{2} η2 R{2},2 and R̃{1,2},2 satisfy constraints (32). To this end, notice that ≤ σ{2} + σ{1,2} θ1 ≤ min{η1 (R{1},1 ), η1 (R{1,2},1 )} (251) σ{1,2} η1 R{1,2},1 + σ{1,2} η2 R{1,2},2 a1 η1 (R{1},1 ) + b1 η1 (R{1,2},1 ) + (244) ≤ (252) σ{2} + σ{1,2} ϵ2 − ϵ1 σ{2} R{2},1 + σ{1,2} R{1,2},1 a1 R{1},1 + b1 R{1,2},1 ≤ η1 ≤ η1 (253) σ{2} + σ{1,2} ϵ − ϵ1 2 σ{2} R{2},2 + σ{1,2} R{1,2},2 ≤ η1 R̃{1},1 , (254) + η2 (245) σ{2} + σ{1,2} where (252) holds because the minimum is smaller than any ≤ η1 (R1 /(1 − ϵ)) + η2 (R2 /(1 − ϵ)) , (246) linear combination and because a1 + b1 = ϵ2 − ϵ1 ; (253) holds HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4275 by the concavity of the function η1 (·); and (254) holds by the where inequalities (265) and (269) hold because a1 + a1,2 ≤ definition of rate R{1},1 . In a similar way we have: σ{1} , c1,2 ≤ σ{2} , and b1 +b1,2 ≤ σ{1,2} , see (247); and (267) and (271) hold by the definitions of rates R̃{1},1 , R̃{1,2},1 , and θ1 ≤ min η1 R{1},1 , η1 R{1,2},1 (255) R̃{1,2},2 and because a1,2 η1 R{1},1 + b1,2 η1 R{1,2},1 ≤ (256) 1 − ϵ2 a1,2 R{1},1 + c1,2 R{2},1 + b1,2 R{1,2},1 a1,2 R{1},1 + b1,2 R{1,2},1 ≥ max{a1,2 R{1},1 + b1,2 R{1,2},1 ; ≤ η1 (257) 1 − ϵ2 b1,2 R{1,2},1 + c1,2 R{2},1 }. (272) ≤ η1 R̃{1,2},1 , (258) The desired converse result to (32) then follows by com- where the last step holds by the monotonicity of the bining (259), (263), (267), and (271), and by noticing that function η1 (·) and because by definition R̃{1,2},1 ≥ by the monotonicity of the function η2 (·) there is no loss in a1,2 R{1},1 +b1,2 R{1,2},1 optimality to restrict to rates R̃{1,2},2 = R2 /(1 − ϵ2 ). 1−ϵ2 . Thus, by (254) and (258): n o 3) The Case ϵ1 > ϵ2 : The proof is similar to the case ϵ1 < θ1 ≤ min η1 R̃{1},1 , η1 R̃{1,2},1 . (259) ϵ2 . We present it here for completeness. Choose nonnegative numbers a1,2 , b2 , b1,2 , c2 , c1,2 We continue to notice satisfying n θ2 ≤ min η1 R{1,2},1 + η2 R{1,2},2 , a1,2 ≤ σ{1} (273a) o b2 + b1,2 ≤ σ{1,2} (273b) η1 R{2},1 + η2 R{2},2 (260) c2 + c1,2 ≤ σ{2} (273c) b1,2 η1 R{1,2},1 + b1,2 η2 R{1,2},2 ≤ a1,2 + b1,2 = b1,2 + c1,2 = 1 − ϵ1 (273d) 1− ϵ2 b2 + c2 = ϵ1 − ϵ2 , (273e) c1,2 η1 R{2},1 + c1,2 η2 R{2},2 + (261) 1 − ϵ2 which is equivalent to the three equalities a1,2 = c1,2 = 1 − b1,2 R{1,2},1 + c1,2 R{2},1 ≤ η1 ϵ1 − b1,2 and c2 = ϵ1 − ϵ2 − b2 and the three inequalities 1 − ϵ2 b1,2 R{1,2},2 + c1,2 R{2},2 1 − ϵ1 − b1,2 ≤ σ{1} (274a) + η2 (262) 1 − ϵ2 b2 + b1,2 ≤ σ{1,2} (274b) ≤ η1 R̃{1,2},1 + η2 R̃{1,2},2 , (263) 1 − ϵ2 − b2 − b1,2 ≤ σ{2} . (274c) where (261) holds because the minimum is smaller than any Through FME it can be shown that a nonnegative pair linear combination and because b1,2 + c1,2 = 1 − ϵ2 ; (262) (b2 , b1,2 ) satisfying (274) exists and the corresponding values holds concavity of the functions η1 (·) and η2 (·); and (263) for a1,2 , c1,2 , c2 are nonnegative whenever holds by the definitions of rates R{1,2},1 and R{1,2},2 and by the monotonicity of the function η1 (·). 0 ≤ σI , I ∈ P(2), (275a) From the rate constraints in (54), we further obtain 1 − ϵi ≤ σ{i} + σ{1,2} , i ∈ {1, 2}, (275b) R1 ≥ σ{1} R{1},1 + σ{2} R{2},1 + σ{1,2} R{1,2},1 (264) 0 ≤ ϵ1 − ϵ2 , (275c) ≥ (a1 + a1,2 )R{1},1 + c1,2 R{2},1 + (b1 + b1,2 )R{1,2},1 which hold by assumption, see (54). (265) Define the new rates a1 R{1},1 + b1 R{1,2},1 = (ϵ2 − ϵ1 ) a1,2 R{1},1 + b1,2 R{1,2},1 ϵ2 − ϵ1 R̃{1,2},1 := max , a1,2 R{1},1 + c1,2 R{2},1 + b1,2 R{1,2},1 1 − ϵ1 + (1 − ϵ2 ) b1,2 R{1,2},1 + c1,2 R{2},1 1 − ϵ2 , 1 − ϵ1 (266) (276) ≥ (ϵ2 − ϵ1 )R̃{1},1 + (1 − ϵ2 )R̃{1,2},1 (267) b1,2 R{1,2},2 + c1,2 R{2},2 R̃{1,2},2 := , (277) and 1 − ϵ1 b2 R{1,2},i + c2 R{2},i R2 ≥ σ{1,2} R{1,2},2 + σ{2} R{2},2 (268) R̃{2},i := , i ∈ {1, 2} ϵ1 − ϵ2 ≥ b1,2 R{1,2},2 + c1,2 R{2},2 (269) (278) b1,2 R{1,2},2 + c1,2 R{2},2 = (1 − ϵ2 ) (270) We show that the exponents θ1 , θ2 and the rates R̃{2},1 ,R̃{2},2 , 1 − ϵ2 R̃{1,2},1 and R̃{1,2},2 satisfy constraints (33). To this end, = (1 − ϵ2 )R̃{1,2},2 , (271) notice that by similar arguments as in the preceding 4276 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 subsections: and θ1 ≤ min η1 R{1},1 , η1 R{1,2},1 (279) R2 ≥ σ{1,2} R{1,2},2 + σ{2} R{2},2 (295) a1,2 η1 R{1},1 + b1,2 η1 R{1,2},1 ≥ (b2 + b1,2 )R{1,2},2 + (c2 + c1,2 )R{2},2 (296) ≤ (280) 1 − ϵ1 b1,2 R{1,2},2 + c1,2 R{2},2 = (1 − ϵ1 ) a1,2 R{1},1 + b1,2 R{1,2},1 1 − ϵ1 ≤ η1 (281) 1 − ϵ1 b2 R{1,2},2 + c2 R{2},2 + (ϵ1 − ϵ2 ) (297) ≤ η1 R̃{1,2},1 . (282) 1 − ϵ1 = (1 − ϵ1 )R̃{1,2},2 + (ϵ1 − ϵ2 )R̃{2},2 . (298) Moreover, Combining (282), (290), (294), and (298) establishes the θ2 ≤ min η1 R{1,2},1 + η2 R{1,2},2 , desired converse result in (33). η1 R{2},1 ) + η2 (R{2},2 ) (283) A PPENDIX E b1,2 η1 R{1,2},1 + b1,2 η2 R{1,2},2 A NALYSIS OF THE C ODING S CHEME IN S ECTION VII-C ≤ 1− ϵ1 Consider the ℓ∗I -hop hypothesis testing scheme employed c1,2 η1 R{2},1 + c1,2 η2 R{2},2 when Y0n ∈ DI , for I ∈ P(K). For any I ∈ P(K), + (284) 1 − ϵ1 let ĤI,1 , . . . , ĤI,ℓ∗I denote the guesses produced at terminals 1, . . . , ℓ∗I when employing this scheme. b1,2 R{1,2},1 + c1,2 R{2},1 ≤ η1 By assumption, the type-I error probabilities of these deci- 1 − ϵ1 b1,2 R{1,2},2 + c1,2 R{2},2 sions tend to 0 as n → ∞ for any I ∈ P(K): + η2 (285) 1 − ϵ1 lim Pr[ĤI,k = 1|H = 0, Y0n ∈ DI ] = 0, k ∈ I. n→∞ ≤ η1 R̃{1,2},1 + η2 R̃{1,2},2 (286) (299) and Recalling that decision center k declares Ĥk = 1 whenever Y0n ∈ D∅ or Y0n ∈ DI for a set I not containing k, and b2 η1 R{1,2},1 + b2 η2 R{1,2},2 θ2 ≤ applying the total law of probability, we can write ϵ1 − ϵ2 αk,n = Pr[Ĥk = 1|H = 0] (300) c2 η1 R{2},1 + c2 η2 R{2},2 + (287) X ϵ1 − ϵ2 = Pr[Ĥk = 1, Y0n ∈ DI |H = 0] (301) b2 R{1,2},1 + c2 R{2},1 I∈(P(K)∪∅) ≤ η1 ϵ1 − ϵ2 X = Pr[Y0n ∈ D∅ |H = 0]+ Pr[Y0n ∈ DI |H = 0] b2 R{1,2},2 + c2 R{2},2 I∈P(K) : + η2 (288) k∈I / ϵ1 − ϵ2 X + Pr[Ĥk = 1, Y0n ∈ DI |H = 0] (302) ≤ η1 R̃{2},1 + η2 R̃{2},2 . (289) I∈P(K) : k∈I Combining (286) and (289) we obtain: X ≤ Pr[Y0n ∈ D∅ |H = 0]+ Pr[Y0n ∈ DI |H = 0] θ2 ≤ min η1 R̃{1,2},1 + η1 R̃{1,2},2 , I∈P(K) : k∈I / X + Pr[ĤI,k = 1|H = 0, Y0n ∈ DI ]. (303) η1 R̃{2},1 + η1 R̃{2},2 . (290) I∈P(K) : k∈I From the rate constraints in (54), inequalities (273), and Combining this inequality with (299), and by Inequali- the definitions of the rates R̃{2},1 , R̃{2},2 , R̃{1,2},1 , R̃{1,2},2 , ties (109), we conclude that the overall scheme satisfies the we obtain: type-I error constraints: R1 ≥ σ{1} R{1},1 + σ{1,2} R{1,2},1 + σ{2} R{2},1 (291) lim αk,n ≤ ϵk , k ∈ {1, . . . , K}. (304) n→∞ ≥ a1,2 R{1},1 + (c2 + c1,2 )R{2},1 + (b2 + b1,2 )R{1,2},1 For the type-II error exponent at a decision center k we (292) observe: b2 R{1,2},1 + c2 R{2},1 = (ϵ1 − ϵ2 ) βk,n = Pr[Ĥk = 0|H = 1] (305) ϵ1 − ϵ2 X Pr[Ĥk = 0, Y0n ∈ DI |H = 1] a1,2 R{1},1 + c1,2 R{2},1 + b1,2 R{1,2},1 = (306) + (1 − ϵ1 ) 1 − ϵ1 I∈(P(K)∪∅) X (293) = Pr[ĤI,k = 0, Y0n ∈ DI |H = 1] (307) ≥ (ϵ1 − ϵ2 )R̃{2},1 + (1 − ϵ1 )R̃{1,2},1 (294) I∈P(K) : k∈I HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4277 X ≤ Pr[ĤI,k = 0|H = 1, Y0n ∈ DI ]. (308) We next show that one can further restrict to nonnegative I∈P(K) : numbers satisfying also (369). To this end, assume that (369) k∈I is violated and define Defining a := σ̃{π(1),π(3)} − σ̃{π(2)} − σ̃{π(1),π(2)} > 0. (321) 1 θk,I := lim − log Pr[ĤI,k = 0|H = 1, Y0n ∈ DI ], (309) Define also the new parameters n→∞ n ′ we conclude by (308) that the exponent σ{1,2,3} := σ̃{1,2,3} + a (322) ′ min θk,I (310) σ{π(3)} := σ̃{π(3)} + a (323) I∈P(K) : ′ k∈I σ{π(1),π(3)} := σ̃{π(1),π(3)} − a (324) ′ is achievable at decision center k. This proves in particular that σ{π(2),π(3)} := σ̃{π(2),π(3)} − a (325) when applying an instance of the multi-hop scheme in [28] for σI′ := σ̃I , π(3) ∈ / I, (326) each set I ∈ P(K), the exponents θ1 , . . . , θK in (5) are proved achievable. and the new rates a λℓ R̃{π(1),π(3)},ℓ + (1 − λℓ )R̃{π(2),π(3)},ℓ A PPENDIX F ′ R{1,2,3},ℓ = ′ P ROOF OF L EMMA 5 σ{1,2,3} To show sufficiency of (368), start by fixing σ̃{1,2,3} R̃{1,2,3},ℓ + ′ , ℓ ∈ {1, 2, 3}, any set of nonnegative numbers {σI }I∈P(3) , and σ{1,2,3} {RI,1 , . . . , RI,ℓ∗I }I∈P(3) satisfying (116) for K = 3, (327a) (and possibly violating (368)). Choose new nonnegative numbers σ̃{1,2,3} , σ̃{π(1),π(2)} , σ̃{π(1),π(3)} , σ̃{π(1)} satisfying a (1 − λℓ )R̃{π(1),π(3)},ℓ + λℓ R̃{π(2),π(3)},ℓ ′ R{π(3)},ℓ = ′ σ̃I ≤ σI , ∀I : π(1) ∈ I, (311) σ{π(3)} σ̃{1,2,3} + σ̃{π(1),π(2)} ≥ 1 − ϵπ(1) − ϵπ(2) (312) σ̃{π(3)} R̃{π(3)},ℓ + ′ , ℓ ∈ {1, . . . , π(3)} σ̃{1,2,3} + σ̃{π(1),π(3)} ≥ 1 − ϵπ(1) − ϵπ(3) (313) σ{π(3)} and (327b) ′ σ̃{1,2,3} + σ̃{π(1),π(2)} + σ̃{π(1),π(3)} + σ̃{π(1)} = 1−ϵπ(1) . RI,ℓ = R̃I,ℓ , I ∈ P(3)\{{1, 2, 3}, {π(3)}}. (327c) (314) Notice that by the definition of a and by (320), the parameters {σI′ } are all nonnegative, and it is easily verified that they The existence of the desired numbers can be checked by continue to satisfy (116) for any choice of λ1 , λ2 , λ3 ∈ [0, 1]. applying the Fourier-Motzkin Elimination algorithm [45] and We next choose the parameters λ1 , λ2 , λ3 ∈ [0, 1] in by noting Constraints (116). Further choose for any set I function of the rates {R̃I,ℓ } and the ordering π(·), and show containing π(1) and ℓ ∈ {1, 2, 3} the rate: that for the proposed choice of rates in (327), the exponents R̃I,ℓ := RI,ℓ , (315) θ1 , θ2 , θ3 are only increased. We distinguish three cases. For notational simplicity we assume π(1) < π(2). (The and for any set I not containing π(1) and ℓ ∈ {1, 2, 3}: proof for π(1) > π(2) is analogous.) This implies that σ̃I := σI + σIπ(1) − σ̃Iπ(1) (316) 1 = π(1) < π(3) or 1 = π(3) < π(2) = 2 (328) σI σIπ(1) − σ̃Iπ(1) R̃I,ℓ := RI,ℓ + RIπ(1) ,ℓ , (317) and σ̃I σ̃I where we defined Iπ(1) := I ∪ {π(1)}. 2 = π(2) < π(3) = 3 or π(3) < π(2) = 3. (329) By Lemma 3, the new set of numbers {σ̃I }I∈P(3) , Case 1: If and {R̃I,1 , . . . , R̃I,ℓ∗I }I∈P(3) also satisfies Constraints (116), which proves that one can restrict to numbers {σI }I∈P(3) η1 R̃{π(1),π(3)},1 ≤ η1 R̃{π(2),π(3)},1 , (330) satisfying (368). Since ϵπ(1) ≥ ϵπ(2) and choose σ{1,2,3} +σ{π(1),π(2)} +σ{π(2),π(3)} +σ{π(2)} ≥ 1 − ϵπ(2) , λℓ = 0, ℓ ∈ {1, . . . , π(3)}, (331) (318) n o λℓ = 1 R̃{π(1),π(3)},ℓ ≥ R̃{π(2),π(3)},ℓ , this further implies that one can restrict to numbers ℓ ∈ {π(3) + 1, . . . , 3}. {σI }I∈P(3) satisfying (332) σ{π(2),π(3)} ≥ σ{π(1),π(3)} + σ{π(1)} − σ{π(2)} (319) Using the same proof steps as in Lemma 3, it can be shown ≥ σ{π(1),π(3)} − σ{π(2)} − σ{π(1),π(2)} . that for this choice of the λs the new rates in (327) still satisfy (320) Constraint (116a) for θπ(3) because λ1 = · · · = λπ(3) . 4278 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 π(2) To see that they satisfy (116a) also for θπ(1) , notice that: a X n + ′ max ηℓ R̃{π(1),π(3)},ℓ , σ{1,2,3} π(1) π(1) ℓ=π(3)+1 X X o min ηℓ R̃{π(1),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ ηℓ R̃{π(2),π(3)},ℓ ℓ=1 ℓ=1 π(2) π(1) σ̃{1,2,3} X X + ηℓ R̃{1,2,3},ℓ ≤ min ηℓ R̃{π(1),π(3)},ℓ , ′ σ{1,2,3} ℓ=1 ℓ=1 π(1) (337) a X ′ ηℓ R̃{π(1),π(3)},ℓ π(2) π(2) X σ{1,2,3} X ′ ′ ℓ=1 ≤ min ηℓ R{π(2),π(3)},ℓ , ηℓ R{1,2,3},ℓ , π(1) ℓ=1 ℓ=1 σ̃{1,2,3} X + ′ ηℓ R̃{1,2,3},ℓ (333) (338) σ{1,2,3} ℓ=1 where notice that the sum in the second line of (337) is empty π(1) X when π(2) ≤ π(3). Here, the last inequality holds by the ≤ min ηℓ R̃{π(1),π(3)},ℓ , ′ definitions of the rates {R{1,2,3},ℓ } and by the choice of the λs ℓ=1 and the concavity and monotonicity of the functions {ηℓ (·)}ℓ . a ′ η1 R̃{π(2),π(3)},1 Case 2: If σ{1,2,3} 2 2 a X X + ′ 1 {π(1) = 2} ηℓ R̃{π(2),π(3)},ℓ ≤ ηℓ R̃{π(1),π(3)},ℓ , (339) σ{1,2,3} ℓ=1 ℓ=1 n o · max η2 R̃{π(1),π(3)},2 , η2 R̃{π(2),π(3)},2 choose π(1) λℓ = 1, ℓ ∈ {1, . . . , max{2, π(3)}}, (340) σ̃{1,2,3} X n o + ′ σ{1,2,3} ηℓ R̃{1,2,3},ℓ λℓ = 1 R̃{π(1),π(3)},ℓ ≥ R̃{π(2),π(3)},ℓ , ℓ=1 ℓ ∈ {max{2, π(3)} + 1, . . . , 3}. (341) (334) π(1) X π(1) X Using similar arguments as in the previous case, one can ′ ′ conclude that the new rates in (327) still satisfy (116a). More ≤ min ηℓ R{π(1),π(3)},ℓ , ηℓ R{1,2,3},ℓ . ℓ=1 ℓ=1 specifically, since λ1 = · · · = λπ(3) = 1 by (340), similar (335) proof steps as in Lemma 3 can be used to show that (116a) holds for θπ(3) . where the second inequality holds by Assumption (330) and To see that (116a) holds for θπ(2) , recall that π(2) ≥ 2 and the third inequality holds by the definitions of the rates notice: ′ {R{1,2,3},ℓ } and by the concavity and monotonicity of the π(2) X π(2) X functions {ηℓ (·)}. min ηℓ R̃{π(2),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ Similarly, we notice for θπ(2) : ℓ=1 ℓ=1 π(2) π(2) π(2) X X X ≤ min ηℓ R̃{π(2),π(3)},ℓ , min ηℓ R̃{π(2),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ ℓ=1 ℓ=1 ℓ=1 π(2) a π(2) X X ′ ηℓ R̃{π(2),π(3)},ℓ ≤ min ηℓ R̃{π(2),π(3)},ℓ , σ{1,2,3} ℓ=1 ℓ=1 π(2) π(2) σ̃{1,2,3} X a X + ′ ηℓ R̃{1,2,3},ℓ (342) ηℓ R̃{π(2),π(3)},ℓ σ{1,2,3} ′ σ{1,2,3} ℓ=1 ℓ=1 π(2) π(2) X σ̃{1,2,3} X ≤ min ηℓ R̃{π(2),π(3)},ℓ , + ′ ηℓ R̃{1,2,3},ℓ (336) σ{1,2,3} ℓ=1 ℓ=1 2 π(2) a X X ′ ηℓ R̃{π(1),π(3)},ℓ ≤ min ηℓ R̃{π(2),π(3)},ℓ , σ{1,2,3} ℓ=1 a 1{π(2) = 3} ℓ=1 + ′ min{π(2),π(3)} σ{1,2,3} a X ′ ηℓ R̃{π(2),π(3)},ℓ n o σ{1,2,3} · max η3 R̃{π(1),π(3)},3 , η3 R̃{π(2),π(3)},3 ℓ=1 HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4279 π(1) σ̃{1,2,3} X because of Assumption (347). For λ = 1, relation (352) holds + ′ ηℓ R̃{1,2,3},ℓ (343) with a < sign because of Assumption (348). By the continuity σ{1,2,3} ℓ=1 of the functions {ηℓ (·)} and the intermediate value theorem, π(2) X π(2) X there is thus a value λ ∈ (0, 1) such that (352) holds with ′ ′ ≤ min ηℓ R{π(2),π(3)},ℓ , ηℓ R{1,2,3},ℓ , equality. Let λ be this value and notice that by the concavity of the functions {ηℓ (·)}: ℓ=1 ℓ=1 (344) η1 R̃{π(1),π(3)},1 +η2 R̄{π(1),π(3)},2 where the second inequality holds by our assumption (339) and since π(2) ≥ 2. +η1 R̃{π(2),π(3)},1 + η2 R̄{π(2),π(3)},2 Finally, (116a) holds for θπ(1) , because: 2 X X2 ≥ ηℓ R̃{π(1),π(3)},ℓ + ηℓ R̃{π(2),π(3)},ℓ , (354) π(1) X π(1) X ℓ=1 ℓ=1 min ηℓ R̃{π(1),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ which combined with (352) implies (353). ℓ=1 ℓ=1 Now that we established the existence of the desired value π(1) X λ, we continue to show that for the choice in (349), Con- ≤ min ηℓ R̃{π(1),π(3)},ℓ , straints (116a) remain valid. For θπ(1) this can be verified ℓ=1 through the following steps, where recall that π(1) ≤ 2: π(1) a X π(1) X π(1) X ′ ηℓ R̃{π(1),π(3)},ℓ min ηℓ R̃{π(1),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ σ{1,2,3} ℓ=1 ℓ=1 ℓ=1 π(1) σ̃{1,2,3} X π(1) + ′ ηℓ R̃{1,2,3},ℓ (345) X σ{1,2,3} ≤ min ηℓ R̃{π(1),π(3)},ℓ , ℓ=1 ℓ=1 π(1) X π(1) X π(1) ′ ′ a ≤ min ηℓ R{π(1),π(3)},ℓ , ηℓ R{1,2,3},ℓ . X ′ ηℓ R̃{π(1),π(3)},ℓ σ{1,2,3} ℓ=1 ℓ=1 ℓ=1 (346) π(1) σ̃{1,2,3} X where the second inequality holds by the assumption π(1) < + ′ ηℓ R̃{1,2,3},ℓ (355) σ{1,2,3} π(2) and thus π(1) ≤ 2. ℓ=1 Case 3: Else, i.e., if π(1) X = min ηℓ R̃{π(1),π(3)},ℓ , η1 R̃{π(1),π(3)},1 > η1 R̃{π(2),π(3)},1 (347) ℓ=1 and a ′ η1 R̃{π(1),π(3)},1 2 X2 σ{1,2,3} 1 {π(1) = 2} · η2 R̄{π(1),π(3)},2 X ηℓ R̃{π(1),π(3)},ℓ < ηℓ R̃{π(2),π(3)},ℓ . (348) + ℓ=1 ℓ=1 π(1) σ̃{1,2,3} X Choose + ′ ηℓ R̃{1,2,3},ℓ (356) λ1 = 1, λ2 = λ, and λ3 = 0, (349) σ{1,2,3} ℓ=1 for a value of λ ∈ [0, 1] so that the auxiliary rates π(1) X π(1) X ′ ′ ≤ min ηℓ R{π(1),π(3)},ℓ , ηℓ R{1,2,3},ℓ , R̄{π(1),π(3)},2 := λR{π(1),π(3)},2 + (1 − λ)R{π(2),π(3)},2 (350) ℓ=1 ℓ=1 R̄{π(2),π(3)},2 := (1 − λ)R{π(1),π(3)},2 + λR{π(2),π(3)},2 (351) (357) satisfy where the second inequality holds since π(1) ≤ 2, and by (348) and (352), and the last inequality holds by the η1 R̃{π(1),π(3)},1 + η2 R̄{π(1),π(3)},2 ′ definitions of the rates {R{1,2,3},ℓ }, the choice of the λs, and 2 X the concavity and monotonicity of the functions {ηℓ (·)}. = ηℓ R̃{π(2),π(3)},ℓ (352) To verify that Constraint (116a) remains valid for θπ(2) , ℓ=1 recall that π(2) ≥ 2 and notice: η1 R̃{π(2),π(3)},1 + η2 R̄{π(2),π(3)},2 π(2) π(2) X X 2 X min ηℓ R̃{π(2),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ ≥ ηℓ R̃{π(1),π(3)},ℓ . (353) ℓ=1 ℓ=1 ℓ=1 π(2) X Existence of the desired choice of λ can be seen as follows. ≤ min ηℓ R̃{π(2),π(3)},ℓ , Notice first that for λ = 0, relation (352) holds with a > sign ℓ=1 4280 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 π(2) a a X + ′ η2 R̄{π(1),π(3)},2 ′ ηℓ R̃{π(2),π(3)},ℓ σ{1,2,3} σ{1,2,3} ℓ=1 a π(2) + ′ 1{π(3) = 3} . ηℓ R̃{π(2),π(3)},ℓ σ̃{1,2,3} X σ{1,2,3} + ′ ηℓ R̃{1,2,3},ℓ (358) π(3) σ{1,2,3} σ̃{1,2,3} ℓ=1 X + ′ ηℓ R̃{1,2,3},ℓ σ{1,2,3} π(2) X ℓ=1 = min ηℓ R̃{π(2),π(3)},ℓ , (363) ℓ=1 2 π(3) π(3) X a X ′ ′ ≤ min ηℓ R{π(2),π(3)},ℓ , ηℓ R{1,2,3},ℓ , X ′ ηℓ R̃{π(2),π(3)},ℓ σ{1,2,3} ℓ=1 ℓ=1 ℓ=1 a + ′ 1 {π(2) = 3} · η3 R̃{π(2),π(3)},3 (364) σ{1,2,3} π(2) where the equality holds by Assumption (352). σ̃{1,2,3} X + ′ ηℓ R̃{1,2,3},ℓ (359) Notice further that σ{1,2,3} ℓ=1 π(3) π(3) π(2) X X min ηℓ R̃{π(1),π(3)},ℓ , ηℓ R̃{π(3)},ℓ X = min ηℓ R̃{π(2),π(3)},ℓ , ℓ=1 ℓ=1 ℓ=1 a π(3) X ′ η1 R̃{π(1),π(3)},1 +η2 R̄{π(1),π(3)},2 ≤ min ηℓ R̃{π(1),π(3)},ℓ , σ{1,2,3} ℓ=1 a + ′ 1 {π(2) = 3} · η3 R̃{π(2),π(3)},3 π(3) σ{1,2,3} a X ′ ηℓ R̃{π(1),π(3)},ℓ π(2) σ{π(3)} σ̃{1,2,3} X ℓ=1 + ′ ηℓ R̃{1,2,3},ℓ (360) π(3) σ{1,2,3} σ̃{π(3)} X ℓ=1 + ′ ηℓ R̃{π(3)},ℓ (365) π(2) σ{π(3)} π(2) X X ℓ=1 ′ ′ ≤ min ηℓ R{π(2),π(3)},ℓ , ηℓ R{1,2,3},ℓ . π(3) X ℓ=1 ℓ=1 ≤ min ηℓ R̃{π(1),π(3)},ℓ , (361) ℓ=1 a ′ η1 R̃{π(2),π(3)},1 Here, the second equality holds by (352). σ{π(3)} Finally, to see that Constraint (116a) is also satisfied for a + ′ η2 R̄{π(2),π(3)},2 θπ(3) , we distinguish two cases. If π(3) = 1, the proof is σ{π(3)} similar to the proof of Lemma 3 because λ1 = 1. For the a proof in the case π(3) ≥ 2, notice first: + ′ 1{π(3) = 3} . η3 R̃{π(1),π(3)},3 σ{π(3)} π(3) σ̃{π(3)} X π(3) X π(3) X + ′ ηℓ R̃{1,2,3},ℓ min ηℓ R̃{π(2),π(3)},ℓ , ηℓ R̃{1,2,3},ℓ σ{π(3)} ℓ=1 ℓ=1 ℓ=1 (366) π(3) π(3) π(3) X ≤ min ηℓ R̃{π(2),π(3)},ℓ , X X ′ ′ ≤ min ηℓ R{π(1),π(3)},ℓ , ηℓ R{π(3)},ℓ , ℓ=1 ℓ=1 ℓ=1 π(3) a X (367) ′ ηℓ R̃{π(2),π(3)},ℓ σ{1,2,3} ℓ=1 where the second inequality holds by (353). π(3) σ̃{1,2,3} X + ′ ηℓ R̃{1,2,3},ℓ (362) σ{1,2,3} ℓ=1 A PPENDIX G π(3) X C ONVERSE P ROVE TO P ROPOSITION 7 = min ηℓ R̃{π(2),π(3)},ℓ , ℓ=1 We start with two auxiliary lemmas. a Lemma 5: Let K = 3. In Theorem 5 ′ η1 R̃{π(1),π(3)},1 σ{1,2,3} it suffices to consider values {σI }I∈P(3) so HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4281 that k X X δI,J σ{1,2,3} + σ{π(1),π(2)} + σ{π(1),π(3)} + σ{π(1)} = 1 − ϵπ(1) ≤ ηℓ · RI,ℓ (380) cJ ℓ=1 I∈P(K) : (368) k∈I k σ{π(1),π(2)} +σ{π(2)} ≥ σ{π(1),π(3)} X ≤ ηℓ R̃J ,ℓ , (381) (369) ℓ=1 Proof: See Appendix F. We thus continue with nonnegative numbers {σI }I∈P(3) , where (378) holds because the minimum of a set of numbers and {RI,1 , . . . , RI,ℓ∗I }I∈P(3) satisfying (116) for K = 3 as is never larger than any convex combination of these numbers; well as (369). The proof of the desired proposition follows (379) holds by the concavity of the functions η1 (·), . . . , ηk (·); by the next lemma (which holds for any positive integer K) (380) holds by assumption (373) and by the monotonicity of and by an appropriate choice of parameters {cJ }, see (390) the functions η1 (·), . . . , ηk (·); and (381) holds by the definition ahead. of R̃J ,k in (374) because k ≥ ℓ and k ∈ J thus ℓ ≤ ℓ∗J . Lemma 6: Let To prove (376), fix k ∈ {1, . . . , K} and for each subset J ⊆ P(K) with ℓ∗J ≥ k pick an index jJ ∈ J so that {cJ : J ∈ P(K)}, (370) jJ ≥ k. Then, by (116b): {δI,J : I, J ∈ P(K) and I ∩ J = ̸ ∅} (371) X be sets of nonnegative integers satisfying Rk ≥ σI · RI,k , (382) I∈P(K) : k≤ℓ∗ X δI,J ≤ σI , I ∈ P(K), (372) X I X J ∈P(K) : ≥ δI,J · RI,k (383) I∩J ̸=∅ I∈P(K) : J ∈P(K) : k≤ℓ∗ I∩J ̸=∅ and X I X X = δI,J · RI,k (384) δI,J ≥ cJ , ∀k ∈ J , J ∈ P(K). (373) J ∈P(K) I∈P(K) : I∈P(K) : k≤ℓ∗I k∈I I∩J ̸=∅ X X Then, the rates ≥ δI,J · RI,k (385) X δI,J J ∈P(K) : I∈P(K) : R̃J ,k := max RI,k , k≤ℓ∗ J k≤ℓ∗I j∈J : cJ I∩J ̸=∅ j≥k I∈P(K) : X X j∈I ≥ δI,J RI,k (386) k≤ ℓ∗J , J ∈ P(K). (374) J ∈P(K) : I∈P(K) : k≤ℓ∗ J k≤ℓ∗I satisfy the following inequalities: jJ ∈I X X k X = δI,J RI,k (387) θk ≤ min ηℓ R̃J ,ℓ , k ∈ {1, . . . , K}, (375) J ∈P(K) : I∈P(K) : J ∈P(K) : k≤ℓ∗ jJ ∈I ℓ=1 J k∈J P I∈P(K) : δI,J RI,k and X jJ ∈I = cJ · , (388) X cJ Rk ≥ cJ · R̃J ,k , k ∈ {1, . . . , K}. (376) J ∈P(K) : k≤ℓ∗ J J ∈P(K) : k≤ℓ∗ J Proof: We start by proving (375). By (116a), for any where (383) holds by Assumption (372); inequalities (385) k ∈ {1, . . . , K} and any set J ⊆ P(K) containing index k: and (386) hold because we consider less summands and each summand is nonnegative (recall that jJ ∈ J ); and finally (g) k X holds because the two conditions jJ ≥ k and jJ ∈ I imply θk ≤ min ηℓ (RI,ℓ ) (377) that ℓ∗I ≥ k. I∈P(K) : k∈I ℓ=1 The proof of the lemma is concluded by recalling the defi- k nition of rate R̃J ,k in (374) and noting that Inequality (381) X δI,J X ≤ P · ηℓ (RI,ℓ ) (378) holds for any set J containing k whereas Inequality (388) I∈P(K) : δI,J I∈P(K) : k∈I ℓ=1 holds for any index jJ ∈ J larger than k. k∈I To obtain the desired simplification in Proposition 7 from k Theorem 5, define the subsets X X δI,J ≤ ηℓ P · RI,ℓ I∈P(K) : δI,J ℓ=1 I∈P(K) : k∈I Jk := {π(k), . . . , π(K)}, k ∈ {1, . . . , K}, k∈I (379) (389) 4282 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 69, NO. 7, JULY 2023 k and the values π(0) := 0 and ϵ0 := 1. Applying above X Rℓ ≤ ηℓ (397) Lemma 6 to the choice ϕk ( ℓ=1 ϵπ(k−1) − ϵπ(k) , J = Jk , k Rℓ cJ := (390) X 0, otherwise, ≤ ηℓ , (398) 1−ϵ ℓ=1 establishes the converse to Conjecture 6 for general values of where (395) holds because the minimum of a set of numbers K, if one renames rates R̃Jk ,ℓ as R̃k,ℓ . The proof is concluded is smaller than any convex combination thereof; (396) holds by showing that above parameter choice is permissible, i.e., by the concavity of the functions ηℓ (·); (397) holds by the that there exist nonnegative numbers {δI,J } satisfying condi- rate constraints (116b); and (398) holds by (393). This estab- tions (372) and (373) for {cJ } in (390). For general values lishes the desired converse. Achievability follows by setting of K this seems cumbersome. σ{1,...,K} = 1 − ϵ and all other σI = 0 in (116a). For K = 3, this can be achieved by means of the Fourier-Motzkin Elimination algorithm [45], which shows the existence of nonnegative numbers {δI,J } satisfying condi- R EFERENCES tions (372) and (373) for {cJ } in (390), whenever (redundant [1] M. Hamad, M. Wigger, and M. Sarkiss, “Optimal exponents in cascaded conditions are omitted) hypothesis testing under expected rate constraints,” in Proc. IEEE Inf. Theory Workshop (ITW), Oct. 2021, pp. 1–6. σ{1,2,3} + σ{π(1),π(2)} + σ{π(1),π(3)} + σ{π(1)} ≥ 1 − ϵπ(1) [2] M. Hamad, M. Wigger, and M. Sarkiss, “Two-hop network with multiple decision centers under expected-rate constraints,” in Proc. IEEE Global (391a) Commun. Conf. (GLOBECOM), Dec. 2021, pp. 1–6. [3] R. Ahlswede and I. Csiszár, “Hypothesis testing with communication σ{1,2,3} + σ{π(1),π(2)} + σ{π(2),π(3)} + σ{π(2)} ≥ 1 − ϵπ(2) constraints,” IEEE Trans. Inf. Theory, vol. IT-32, no. 4, pp. 533–542, (391b) Jul. 1986. [4] T. Han, “Hypothesis testing with multiterminal data compression,” IEEE σ{1,2,3} + σ{π(1),π(3)} + σ{π(2),π(3)} + σ{π(3)} ≥ 1 − ϵπ(3) Trans. Inf. Theory, vol. IT-33, no. 6, pp. 759–772, Nov. 1987. [5] T. Han and K. Kobayashi, “Exponential-type error probabilities for (391c) multiterminal hypothesis testing,” IEEE Trans. Inf. Theory, vol. 35, no. 1, pp. 2–14, Jan. 1989. and [6] H. M. H. Shalaby and A. Papamarcou, “Multiterminal detection with zero-rate data compression,” IEEE Trans. Inf. Theory, vol. 38, no. 2, 2σ{1,2,3} + 2σ{π(1),π(2)} + σ{π(1),π(3)} + σ{π(2),π(3)} pp. 254–267, Mar. 1992. +σ{π(1)} + σ{π(2)} + σ{π(3)} ≥ 1 − ϵπ(1) + 1 − ϵπ(3) . [7] H. Shimokawa, T. S. Han, and S. Amari, “Error bound of hypothesis testing with data compression,” in Proc. IEEE Int. Symp. Inf. Theory, (391d) Jul. 1994, p. 114. [8] M. S. Rahman and A. B. Wagner, “On the optimality of binning for Since Conditions (391a)–(391c) are satisfied by Assump- distributed hypothesis testing,” IEEE Trans. Inf. Theory, vol. 58, no. 10, pp. 6282–6303, Oct. 2012. tion (116c) and Condition (391d) is implied by (391a), (391b), [9] N. Weinberger, Y. Kochman, and M. Wigger, “Exponent trade-off for and (369), this concludes the proof for K = 3 and thus hypothesis testing over noisy channels,” in Proc. IEEE Int. Symp. Inf. establishes Proposition 7. Theory (ISIT), Jul. 2019, pp. 1852–1856. [10] S. Watanabe, “On sub-optimality of random binning for distributed hypothesis testing,” 2022, arXiv:2201.13005. A PPENDIX H [11] S. Sreekumar and D. Gündüz, “Distributed hypothesis testing over C ONVERSE FOR THE C ASE ϵ1 = · · · = ϵK discrete memoryless channels,” IEEE Trans. Inf. Theory, vol. 66, no. 4, pp. 2044–2066, Apr. 2020. Define X [12] S. Salehkalaibar and M. Wigger, “Distributed hypothesis testing based ϕk ≜ σI (392) on unequal-error protection codes,” IEEE Trans. Inf. Theory, vol. 66, pp. 4150–4182, 2020. I∈P(K) : [13] S. Sreekumar and D. Gunduz, “Strong converse for testing against k∈I independence over a noisy channel,” in Proc. IEEE Int. Symp. Inf. Theory and notice that by (116c): (ISIT), Jun. 2020, pp. 1283–1288. [14] S. Sreekumar, D. Gunduz, and A. Cohen, “Distributed hypothesis testing ϕk ≥ 1 − ϵk = 1 − ϵ. (393) under privacy constraints,” in Proc. IEEE Inf. Theory Workshop (ITW), Nov. 2018, pp. 1–5. [15] A. Gilani, S. Belhadj Amor, S. Salehkalaibar, and V. Y. F. Tan, “Dis- By (116a) we have for any k ∈ {1, . . . , K}: tributed hypothesis testing with privacy constraints,” Entropy, vol. 21, k no. 5, p. 478, May 2019. [16] Y. Xiang and Y.-H. Kim, “Interactive hypothesis testing with com- X θk ≤ min ηℓ (RI,ℓ ), (394) munication constraints,” in Proc. 50th Annu. Allerton Conf. Commun., I∈P(K) : k∈I ℓ=1 Control, Comput. (Allerton), Oct. 2012, pp. 1065–1072. k [17] Y. Xiang and Y.-H. Kim, “Interactive hypothesis testing against indepen- X X σI dence,” in Proc. IEEE Int. Symp. Inf. Theory, Jun. 2013, pp. 2840–2844. ≤ ηℓ (RI,ℓ ) (395) [18] G. Katz, P. Piantanida, and M. Debbah, “Collaborative distributed ϕk ℓ=1 I∈P(K) : hypothesis testing with general hypotheses,” in Proc. IEEE Int. Symp. k∈I Inf. Theory (ISIT), Jul. 2016, pp. 1705–1709. [19] G. Katz, P. Piantanida, and M. Debbah, “Distributed binary detection k with lossy data compression,” IEEE Trans. Inf. Theory, vol. 63, no. 8, X X σI ≤ ηℓ RI,ℓ (396) pp. 5207–5227, Aug. 2017. ϕk [20] W. Zhao and L. Lai, “Distributed detection with vector quantizer,” IEEE ℓ=1 I∈P(K) : k∈I Trans. Signal Inf. Process. Netw., vol. 2, no. 2, pp. 105–119, Jun. 2016. HAMAD et al.: MULTI-HOP NETWORK WITH MULTIPLE DECISION CENTERS UNDER EXPECTED-RATE CONSTRAINTS 4283 [21] W. Zhao and L. Lai, “Distributed testing with cascaded encoders,” IEEE [42] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Trans. Inf. Theory, vol. 64, no. 11, pp. 7339–7348, Nov. 2018. Discrete Memoryless Systems. Cambridge, U.K.: Cambridge Univ. Press, [22] M. Wigger and R. Timo, “Testing against independence with multi- 2011. ple decision centers,” in Proc. Int. Conf. Signal Process. Commun. [43] M. Hamad, M. Wigger, and M. Sarkiss, “Cooperative multi-sensor (SPCOM), Jun. 2016, pp. 1–5. detection under variable-length coding,” 2020, arXiv:2010.09616. [23] S. Salehkalaibar, M. Wigger, and R. Timo, “On hypothesis testing [44] M. Hamad, M. Sarkiss, and M. Wigger, “Benefits of rate-sharing for against conditional independence with multiple decision centers,” IEEE distributed hypothesis testing,” 2022, arXiv:2202.02282. Trans. Commun., vol. 66, no. 6, pp. 2409–2420, Jun. 2018. [45] I. B. Gattegno, Z. Goldfeld, and H. H. Permuter, “Fourier–Motzkin [24] P. Escamilla, M. Wigger, and A. Zaidi, “Distributed hypothesis testing elimination software for information theoretic inequalities,” 2016, with concurrent detection,” in Proc. ISIT, Jun. 2018, pp. 166–170. arXiv:1610.03990. [25] P. Escamilla, A. Zaidi, and M. Wigger, “Distributed hypothesis testing with collaborative detection,” in Proc. 56th Annu. Allerton Conf. Com- mun., Control, Comput. (Allerton), Oct. 2018, pp. 512–518. [26] P. Escamilla, M. Wigger, and A. Zaidi, “Distributed hypothesis testing: Mustapha Hamad (Member, IEEE) received the B.E. degree in computer Cooperation and concurrent detection,” IEEE Trans. Inf. Theory, vol. 66, engineering and the M.Sc. degree (Hons.) in computer and communication no. 12, pp. 7550–7564, Dec. 2020. engineering from Lebanese American University (LAU), Byblos, Lebanon, in [27] S. Salehkalaibar, M. Wigger, and L. Wang, “Hypothesis testing over 2015 and 2018, respectively, and the Ph.D. degree in networks, information, the two-hop relay network,” IEEE Trans. Inf. Theory, vol. 65, no. 7, and communications from Télécom Paris, Institut Polytechnique de Paris, pp. 4411–4433, Jul. 2019. France, in 2022. Between 2016 and 2019, he was a Systems and Security [28] S. Salehkalaibar, M. Wigger, and L. Wang, “Hypothesis testing over the Engineer at Computer Business Machines (CBM), a Partner Company of IBM, two-hop relay network,” 2017, arXiv:1708.05198. Lebanon, and a Research Assistant at LAU. Since November 2022, he has [29] D. Cao, L. Zhou, and V. Y. F. Tan, “Strong converse for hypothesis been a Senior Research Engineer at the Huawei Mathematical and Algorithmic testing against independence over a two-hop network,” Entropy, vol. 21, Sciences Laboratory, Paris Research Center, France. His research interests no. 12, p. 1171, Nov. 2019. are information theory, mainly distributed hypothesis testing, communication theory, semantic communications, and explainable machine learning. [30] S. Salehkalaibar and M. Wigger, “Distributed hypothesis testing with variable-length coding,” IEEE J. Sel. Areas Inf. Theory, vol. 1, no. 3, pp. 681–694, Nov. 2020. [31] S. Salehkalaibar and V. Y. F. Tan, “Distributed sequential hypothesis testing with zero-rate compression,” in Proc. IEEE Inf. Theory Workshop Michèle Wigger (Senior Member, IEEE) received the M.Sc. degree (Hons.) (ITW), Oct. 2021, pp. 1–5. in electrical engineering and the Ph.D. degree in electrical engineering from ETH Zurich in 2003 and 2008, respectively. In 2009, she was a Post-Doctoral [32] Y. Inan, M. Kayaalp, A. H. Sayed, and E. Telatar, “A fundamental Fellow at the University of California at San Diego, San Diego, USA, and limit of distributed hypothesis testing under memoryless quantization,” then joined Télécom Paris, France, where she is currently a Full Professor. in Proc. IEEE Int. Conf. Commun., May 2021, p. 8. She has held a visiting professor appointments at the Technion–Israel Institute [33] S. Salehkalaibar and M. Wigger, “Distributed hypothesis testing with of Technology, University of Zurich, and ETH Zurich. Her research interests variable-length coding,” in Proc. 18th Int. Symp. Model. Optim. Mobile, are in multi-terminal information theory, in particular in distributed source Ad Hoc, Wireless Netw. (WiOPT), 2020, pp. 1–5. coding and in capacities of networks and in the limits of distributed hypothesis [34] H. Tyagi and S. Watanabe, “Strong converse using change of measure testing. Previously, she has served as an Associate Editor of the IEEE arguments,” IEEE Trans. Inf. Theory, vol. 66, no. 2, pp. 689–703, C OMMUNICATION L ETTERS and is currently serving as an Associate Editor Feb. 2020. for the IEEE T RANSACTIONS ON I NFORMATION T HEORY (Shannon Theory). [35] W. Gu and M. Effros, “A strong converse for a collection of net- During 2016 to 2019, she also served on the Board of Governors of the IEEE work source coding problems,” in Proc. IEEE Int. Symp. Inf. Theory, Information Theory Society. Jun. 2009, pp. 2316–2320. [36] W. Gu and M. Effros, “A strong converse in source coding for super- source networks,” in Proc. IEEE Int. Symp. Inf. Theory Proc., Jul. 2011, pp. 395–399. Mireille Sarkiss (Member, IEEE) received the Engineering degree in telecom- [37] J. Liu, R. van Handel, and S. Verdu, “Beyond the blowing-up lemma: munications and computer science from the Faculty of Engineering, Lebanese Sharp converses via reverse hypercontractivity,” in Proc. IEEE Int. Symp. University, Lebanon, in 2003, and the M.Sc. and Ph.D. degrees in communi- Inf. Theory (ISIT), Jun. 2017, pp. 943–947. cations and electronics from Télécom Paris, France, in 2004 and 2009, respec- [38] K. Marton, “A simple proof of the blowing-up lemma,” IEEE Trans. Inf. tively. She was a Doctoral Researcher at the Orange Laboratories from 2004 to Theory, vol. IT-32, no. 3, pp. 445–446, May 1986. 2007 and a Post-Doctoral Researcher at Télécom Paris from 2009 to 2010. [39] A. D. Wyner and J. Ziv, “The rate-distortion function for source coding From 2010 to 2018, she was a Researcher at the Communicating Systems with side information at the decoder,” IEEE Trans. Inf. Theory, vol. IT- Laboratory, CEA LIST, France. Since December 2018, she has been an 22, no. 1, pp. 1–10, Jan. 1976. Associate Professor at Télécom SudParis, Institut Polytechnique de Paris, [40] Y. Oohama, “Exponential strong converse for source coding with side France. Her research interests are in communication theory and information information at the decoder,” Entropy, vol. 20, no. 5, p. 352, May 2018. theory, including resource allocation, optimization and reinforcement learning [41] M. Hamad, M. Wigger, and M. Sarkiss, “Strong converses using change for mobile-edge computing and energy-efficient wireless networks, physical of measure and asymptotic Markov chains,” 2022, arXiv:2205.08910. layer security, and distributed hypothesis testing.