TY - GEN
T1 - Coded retransmission in wireless networks via abstract MDPs
T2 - IEEE International Symposium on Information Theory, ISIT 2015
AU - Shifrin, Mark
AU - Cohen, Asaf
AU - Weisman, Olga
AU - Gurewitz, Omer
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/9/28
Y1 - 2015/9/28
N2 - Consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel. For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest possible throughput. While such multiple-unicast models are unsolved in general, several network-coding-based schemes have been suggested. In such schemes, the transmitter can send either an uncoded packet or a coded packet, which is a function of several packets. Sent packets can be received by the designated receiver (with some probability) or overheard and stored by other receivers. Two operational modes are considered: in the first, the storage time is unlimited, while in the second it is limited by a given Time To Live (TTL) parameter. We model the transmission process as an infinite-horizon Markov Decision Process (MDP). Since the large state space renders exact solutions computationally impractical, we introduce policy-restricted and induced MDPs with a significantly reduced state space which, with a properly chosen reward, have the same optimal value function. We then derive a reinforcement learning algorithm that approximates the optimal strategy. The algorithm adapts to packet loss rates unknown in advance, attains a high gain over the uncoded setup, and is comparable with the upper bound by Wang, derived for a much stronger coding scheme.
AB - Consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel. For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest possible throughput. While such multiple-unicast models are unsolved in general, several network-coding-based schemes have been suggested. In such schemes, the transmitter can send either an uncoded packet or a coded packet, which is a function of several packets. Sent packets can be received by the designated receiver (with some probability) or overheard and stored by other receivers. Two operational modes are considered: in the first, the storage time is unlimited, while in the second it is limited by a given Time To Live (TTL) parameter. We model the transmission process as an infinite-horizon Markov Decision Process (MDP). Since the large state space renders exact solutions computationally impractical, we introduce policy-restricted and induced MDPs with a significantly reduced state space which, with a properly chosen reward, have the same optimal value function. We then derive a reinforcement learning algorithm that approximates the optimal strategy. The algorithm adapts to packet loss rates unknown in advance, attains a high gain over the uncoded setup, and is comparable with the upper bound by Wang, derived for a much stronger coding scheme.
UR - http://www.scopus.com/inward/record.url?scp=84969822493&partnerID=8YFLogxK
U2 - 10.1109/ISIT.2015.7282932
DO - 10.1109/ISIT.2015.7282932
M3 - Conference contribution
AN - SCOPUS:84969822493
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 2628
EP - 2632
BT - Proceedings - 2015 IEEE International Symposium on Information Theory, ISIT 2015
PB - Institute of Electrical and Electronics Engineers
Y2 - 14 June 2015 through 19 June 2015
ER -