
References

Published online by Cambridge University Press: 05 April 2016

Vikram Krishnamurthy
Affiliation: Cornell University/Cornell Tech

Type: Chapter
Book: Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing, pp. 455–470
Publisher: Cambridge University Press
Print publication year: 2016
Chapter DOI: https://doi.org/10.1017/CBO9781316471104.028



[1] D. Aberdeen and J. Baxter, Scaling internal-state policy-gradient methods for POMDPs. In International Conference on Machine Learning, pp. 3–10, 2002.
[2] J. Abounadi, D. P. Bertsekas and V. Borkar, Learning algorithms for Markov decision processes with average cost. SIAM Journal on Control and Optimization, 40(3):681–98, 2001.
[3] D. Acemoglu and A. Ozdaglar, Opinion dynamics and learning in social networks. Dynamic Games and Applications, 1(1):3–49, 2011.
[4] S. Afriat, The construction of utility functions from expenditure data. International Economic Review, 8(1):67–77, 1967.
[5] S. Afriat, Logic of Choice and Economic Theory. (Oxford: Clarendon Press, 1987).
[6] S. Agrawal and N. Goyal, Analysis of Thompson sampling for the multi-armed bandit problem. In Proceedings of the 25th Annual Conference on Learning Theory, volume 23, 2012.
[7] R. Ahuja and J. Orlin, Inverse optimization. Operations Research, 49(5):771–83, 2001.
[8] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, Wireless sensor networks: A survey. Computer Networks, 38(4):393–422, 2002.
[9] A. Albore, H. Palacios and H. Geffner, A translation-based approach to contingent planning. In International Joint Conference on Artificial Intelligence, pp. 1623–28, 2009.
[10] S. C. Albright, Structural results for partially observed Markov decision processes. Operations Research, 27(5):1041–53, Sept.–Oct. 1979.
[11] E. Altman, Constrained Markov Decision Processes. (London: Chapman and Hall, 1999).
[12] E. Altman, B. Gaujal and A. Hordijk, Discrete-Event Control of Stochastic Networks: Multimodularity and Regularity. (Springer-Verlag, 2004).
[13] T. Ben-Zvi and A. Grosfeld-Nir, Partially observed Markov decision processes with binomial observations. Operations Research Letters, 41(2):201–6, 2013.
[14] M. Dorigo and M. Gambardella, Ant-Q: A reinforcement learning approach to the traveling salesman problem. In Proceedings of the 12th International Conference on Machine Learning, pp. 252–60, 1995.
[15] R. Amir, Supermodularity and complementarity in economics: An elementary survey. Southern Economic Journal, 71(3):636–60, 2005.
[16] M. S. Andersland and D. Teneketzis, Measurement scheduling for recursive team estimation. Journal of Optimization Theory and Applications, 89(3):615–36, June 1996.
[17] B. D. O. Anderson and J. B. Moore, Optimal Filtering. (Englewood Cliffs, NJ: Prentice Hall, 1979).
[18] B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. (Englewood Cliffs, NJ: Prentice Hall, 1989).
[19] S. Andradottir, A global search method for discrete stochastic optimization. SIAM Journal on Optimization, 6(2):513–30, May 1996.
[20] S. Andradottir, Accelerating the convergence of random search methods for discrete stochastic optimization. ACM Transactions on Modeling and Computer Simulation, 9(4):349–80, Oct. 1999.
[21] A. Arapostathis, V. Borkar, E. Fernández-Gaucherand, M. K. Ghosh and S. I. Marcus, Discrete-time controlled Markov processes with average cost criterion: A survey. SIAM Journal on Control and Optimization, 31(2):282–344, 1993.
[22] P. Artzner, F. Delbaen, J. Eber and D. Heath, Coherent measures of risk. Mathematical Finance, 9(3):203–28, July 1999.
[23] P. Artzner, F. Delbaen, J. Eber, D. Heath and H. Ku, Coherent multiperiod risk adjusted values and Bellman's principle. Annals of Operations Research, 152(1):5–22, 2007.
[24] K. J. Åström, Optimal control of Markov processes with incomplete state information. Journal of Mathematical Analysis and Applications, 10(1):174–205, 1965.
[25] R. Atar and O. Zeitouni, Lyapunov exponents for finite state nonlinear filtering. SIAM Journal on Control and Optimization, 35(1):36–55, 1997.
[26] S. Athey, Monotone comparative statics under uncertainty. The Quarterly Journal of Economics, 117(1):187–223, 2002.
[27] P. Auer, N. Cesa-Bianchi and P. Fischer, Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2–3):235–56, 2002.
[28] A. Y. Ng and S. Russell, Algorithms for inverse reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning, pp. 663–70, 2000.
[29] A. Banerjee, A simple model of herd behavior. Quarterly Journal of Economics, 107(3):797–817, August 1992.
[30] A. Banerjee, X. Guo and H. Wang, On the optimality of conditional expectation as a Bregman predictor. IEEE Transactions on Information Theory, 51(7):2664–9, 2005.
[31] T. Banerjee and V. Veeravalli, Data-efficient quickest change detection with on-off observation control. Sequential Analysis, 31:40–77, 2012.
[32] Y. Bar-Shalom, X. R. Li and T. Kirubarajan, Estimation with Applications to Tracking and Navigation. (New York, NY: John Wiley, 2008).
[33] J. S. Baras and A. Bensoussan, Optimal sensor scheduling in nonlinear filtering of diffusion processes. SIAM Journal on Control and Optimization, 27(4):786–813, July 1989.
[34] G. Barles and P. E. Souganidis, Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Analysis, 4(3):271–83, 1991.
[35] P. Bartlett and J. Baxter, Estimation and approximation bounds for gradient-based reinforcement learning. Journal of Computer and System Sciences, 64(1):133–50, 2002.
[36] M. Basseville and I. V. Nikiforov, Detection of Abrupt Changes: Theory and Application. Information and System Sciences Series. (Englewood Cliffs, NJ: Prentice Hall, 1993).
[37] N. Bäuerle and U. Rieder, More risk-sensitive Markov decision processes. Mathematics of Operations Research, 39(1):105–20, 2013.
[38] L. E. Baum and T. Petrie, Statistical inference for probabilistic functions of finite state Markov chains. Annals of Mathematical Statistics, 37:1554–63, 1966.
[39] L. E. Baum, T. Petrie, G. Soules and N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 41(1):164–71, 1970.
[40] R. Bellman, Dynamic Programming, 1st edition. (Princeton, NJ: Princeton University Press, 1957).
[41] M. Benaim and M. Faure, Consistency of vanishingly smooth fictitious play. Mathematics of Operations Research, 38(3):437–50, Aug. 2013.
[42] M. Benaim, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions. SIAM Journal on Control and Optimization, 44(1):328–48, 2005.
[43] M. Benaim, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions, Part II: Applications. Mathematics of Operations Research, 31(3):673–95, 2006.
[44] M. Benaim and J. Weibull, Deterministic approximation of stochastic evolution in games. Econometrica, 71(3):873–903, 2003.
[45] V. E. Beneš, Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics, 5:65–92, 1981.
[46] A. Bensoussan, Stochastic Control of Partially Observable Systems. (Cambridge University Press, 1992).
[47] A. Bensoussan and J. Lions, Impulsive Control and Quasi-Variational Inequalities. (Paris: Gauthier-Villars, 1984).
[48] A. Benveniste, M. Metivier and P. Priouret, Adaptive Algorithms and Stochastic Approximations, volume 22 of Applications of Mathematics. (Springer-Verlag, 1990).
[49] D. P. Bertsekas, Dynamic Programming and Optimal Control, volumes 1 and 2. (Belmont, MA: Athena Scientific, 2000).
[50] D. P. Bertsekas, Nonlinear Programming. (Belmont, MA: Athena Scientific, 2000).
[51] D. P. Bertsekas, Dynamic programming and suboptimal control: A survey from ADP to MPC. European Journal of Control, 11(4):310–34, 2005.
[52] D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete-Time Case. (New York, NY: Academic Press, 1978).
[53] D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. (Belmont, MA: Athena Scientific, 1996).
[54] D. P. Bertsekas and H. Yu, Q-learning and enhanced policy iteration in discounted dynamic programming. Mathematics of Operations Research, 37(1):66–94, 2012.
[55] L. Bianchi, M. Dorigo, L. Gambardella and W. Gutjahr, A survey on metaheuristics for stochastic combinatorial optimization. Natural Computing: An International Journal, 8(2):239–87, 2009.
[56] S. Bikhchandani, D. Hirshleifer and I. Welch, A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy, 100(5):992–1026, October 1992.
[57] P. Billingsley, Statistical Inference for Markov Processes, volume 2. (University of Chicago Press, 1961).
[58] P. Billingsley, Convergence of Probability Measures. (New York, NY: John Wiley, 1968).
[59] P. Billingsley, Probability and Measure. (New York, NY: John Wiley, 1986).
[60] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems. (Artech House, 1999).
[61] R. Bond, C. Fariss, J. Jones, A. Kramer, C. Marlow, J. Settle and J. Fowler, A 61-million-person experiment in social influence and political mobilization. Nature, 489:295–8, September 2012.
[62] J. G. Booth and J. P. Hobert, Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society, 61:265–85, 1999.
[63] V. S. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint. (Cambridge University Press, 2008).
[64] S. Bose, G. Orosel, M. Ottaviani and L. Vesterlund, Dynamic monopoly pricing and herding. The RAND Journal of Economics, 37(4):910–28, 2006.
[65] S. Boucheron, G. Lugosi and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. (Oxford University Press, 2013).
[66] S. Boyd, P. Diaconis and L. Xiao, Fastest mixing Markov chain on a graph. SIAM Review, 46(4):667–89, 2004.
[67] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
[68] S. Boyd and L. Vandenberghe, Convex Optimization. (Cambridge University Press, 2004).
[69] P. Bremaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. (Springer-Verlag, 1999).
[70] R. W. Brockett and J. M. C. Clarke, The geometry of the conditional density equation. In O. L. R. Jacobs et al., editors, Analysis and Optimization of Stochastic Systems, pp. 299–309 (New York, 1980).
[71] S. Bubeck and N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems. arXiv preprint arXiv:1204.5721, 2012.
[72] S. Bundfuss and M. Dür, Algorithmic copositivity detection by simplicial partition. Linear Algebra and Its Applications, 428(7):1511–23, 2008.
[73] S. Bundfuss and M. Dür, An adaptive linear approximation algorithm for copositive programs. SIAM Journal on Optimization, 20(1):30–53, 2009.
[74] P. E. Caines, Linear Stochastic Systems. (John Wiley, 1988).
[75] E. J. Candès and T. Tao, The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053–80, May 2010.
[76] O. Cappe, E. Moulines and T. Ryden, Inference in Hidden Markov Models. (Springer-Verlag, 2005).
[77] A. R. Cassandra, Tony's POMDP page. www.cs.brown.edu/research/ai/pomdp/
[78] A. R. Cassandra, Exact and Approximate Algorithms for Partially Observable Markov Decision Processes. PhD thesis, Dept. of Computer Science, Brown University, 1998.
[79] A. R. Cassandra, A survey of POMDP applications. In Working Notes of AAAI 1998 Fall Symposium on Planning with Partially Observable Markov Decision Processes, pp. 17–24, 1998.
[80] A. R. Cassandra, L. Kaelbling and M. L. Littman, Acting optimally in partially observable stochastic domains. In AAAI, volume 94, pp. 1023–8, 1994.
[81] A. R. Cassandra, M. L. Littman and N. L. Zhang, Incremental pruning: A simple, fast, exact method for partially observed Markov decision processes. In Proceedings of the 13th Annual Conference on Uncertainty in Artificial Intelligence (UAI-97). (Providence, RI, 1997).
[82] C. G. Cassandras and S. Lafortune, Introduction to Discrete Event Systems. (Springer-Verlag, 2008).
[83] O. Cavus and A. Ruszczynski, Risk-averse control of undiscounted transient Markov models. SIAM Journal on Control and Optimization, 52(6):3935–66, 2014.
[84] C. Chamley, Rational Herds: Economic Models of Social Learning. (Cambridge University Press, 2004).
[85] C. Chamley, A. Scaglione and L. Li, Models for the diffusion of beliefs in social networks: An overview. IEEE Signal Processing Magazine, 30(3):16–29, 2013.
[86] W. Chiou, A note on estimation algebras on nonlinear filtering theory. Systems and Control Letters, 28:55–63, 1996.
[87] J. M. C. Clark, The design of robust approximations to the stochastic differential equations of nonlinear filtering. In J. K. Skwirzynski, editor, Communication Systems and Random Processes Theory, Darlington 1977. (Alphen aan den Rijn: Sijthoff and Noordhoff, 1978).
[88] T. F. Coleman and Y. Li, An interior trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization, 6(2):418–45, 1996.
[89] T. M. Cover and M. E. Hellman, The two-armed-bandit problem with time-invariant finite memory. IEEE Transactions on Information Theory, 16(2):185–95, 1970.
[90] T. M. Cover and J. A. Thomas, Elements of Information Theory. (Wiley-Interscience, 2006).
[91] A. Dasgupta, R. Kumar and D. Sivakumar, Social sampling. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 235–43 (Beijing, 2012). ACM.
[92] M. H. A. Davis, On a multiplicative functional transformation arising in nonlinear filtering theory. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 54:125–39, 1980.
[93] S. Dayanik and C. Goulding, Detection and identification of an unobservable change in the distribution of a Markov-modulated random sequence. IEEE Transactions on Information Theory, 55(7):3323–45, 2009.
[94] A. P. Dempster, N. M. Laird and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, B, 39:1–38, 1977.
[95] E. Denardo and U. Rothblum, Optimal stopping, exponential utility, and linear programming. Mathematical Programming, 16(1):228–44, 1979.
[96] C. Derman, G. J. Lieberman and S. M. Ross, Optimal system allocations with penalty cost. Management Science, 23(4):399–403, December 1976.
[97] R. Douc, E. Moulines and T. Ryden, Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime. The Annals of Statistics, 32(5):2254–304, 2004.
[98] A. Doucet, N. De Freitas and N. Gordon, editors, Sequential Monte Carlo Methods in Practice. (Springer-Verlag, 2001).
[99] A. Doucet, S. Godsill and C. Andrieu, On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10:197–208, 2000.
[100] A. Doucet, N. Gordon and V. Krishnamurthy, Particle filters for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 49:613–24, 2001.
[101] A. Doucet and A. M. Johansen, A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, Oxford Handbook on Nonlinear Filtering. (Oxford University Press, 2011).
[102] E. Dynkin, Controlled random sequences. Theory of Probability & Its Applications, 10(1):1–14, 1965.
[103] J. N. Eagle, The optimal search for a moving target when the search path is constrained. Operations Research, 32:1107–15, 1984.
[104] R. J. Elliott, L. Aggoun and J. B. Moore, Hidden Markov Models – Estimation and Control. (New York, NY: Springer-Verlag, 1995).
[105] R. J. Elliott and V. Krishnamurthy, Exact finite-dimensional filters for maximum likelihood parameter estimation of continuous-time linear Gaussian systems. SIAM Journal on Control and Optimization, 35(6):1908–23, November 1997.
[106] R. J. Elliott and V. Krishnamurthy, New finite dimensional filters for estimation of discrete-time linear Gaussian models. IEEE Transactions on Automatic Control, 44(5):938–51, May 1999.
[107] Y. Ephraim and N. Merhav, Hidden Markov processes. IEEE Transactions on Information Theory, 48:1518–69, June 2002.
[108] S. N. Ethier and T. G. Kurtz, Markov Processes: Characterization and Convergence. (Wiley, 1986).
[109] J. Evans and V. Krishnamurthy, Hidden Markov model state estimation over a packet switched network. IEEE Transactions on Signal Processing, 47(8):2157–66, August 1999.
[110] R. Evans, V. Krishnamurthy and G. Nair, Networked sensor management and data rate control for tracking maneuvering targets. IEEE Transactions on Signal Processing, 53(6):1979–91, June 2005.
[111] M. Fanaswala and V. Krishnamurthy, Syntactic models for trajectory constrained track-before-detect. IEEE Transactions on Signal Processing, 62(23):6130–42, 2014.
[112] M. Fanaswala and V. Krishnamurthy, Detection of anomalous trajectory patterns in target tracking via stochastic context-free grammars and reciprocal process models. IEEE Journal of Selected Topics in Signal Processing, 7(1):76–90, Feb. 2013.
[113] M. Fazel, H. Hindi and S. P. Boyd, Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In Proceedings of the 2003 American Control Conference, 2003.
[114] E. Feinberg and A. Shwartz, editors, Handbook of Markov Decision Processes. (Springer-Verlag, 2002).
[115] J. A. Fessler and A. O. Hero, Space-alternating generalized expectation-maximization algorithm. IEEE Transactions on Signal Processing, 42(10):2664–77, 1994.
[116] J. Filar, L. Kallenberg and H. Lee, Variance-penalized Markov decision processes. Mathematics of Operations Research, 14(1):147–61, 1989.
[117] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, volume 25. (Springer Science & Business Media, 2006).
[118] A. Fostel, H. Scarf and M. Todd, Two new proofs of Afriat's theorem. Economic Theory, 24(1):211–19, 2004.
[119] D. Fudenberg and D. K. Levine, The Theory of Learning in Games. (MIT Press, 1998).
[120] D. Fudenberg and D. K. Levine, Consistency and cautious fictitious play. Journal of Economic Dynamics and Control, 19(5–7):1065–89, 1995.
[121] F. R. Gantmacher, Matrix Theory, volume 2. (New York, NY: Chelsea Publishing Company, 1960).
[122] A. Garivier and E. Moulines, On upper-confidence bound policies for switching bandit problems. In Algorithmic Learning Theory, pp. 174–88. (Springer-Verlag, 2011).
[123] E. Gassiat and S. Boucheron, Optimal error exponents in hidden Markov models order estimation. IEEE Transactions on Information Theory, 49(4):964–80, 2003.
[124] D. Ghosh, Maximum likelihood estimation of the dynamic shock-error model. Journal of Econometrics, 41(1):121–43, 1989.
[125] J. C. Gittins, Multi-Armed Bandit Allocation Indices. (Wiley, 1989).
[126] S. Goel and M. J. Salganik, Respondent-driven sampling as Markov chain Monte Carlo. Statistics in Medicine, 28:2209–29, 2009.
[127] G. Golubev and R. Khasminskii, Asymptotic optimal filtering for a hidden Markov model. Math. Methods Statist., 7(2):192–208, 1998.
[128] N. J. Gordon, D. J. Salmond and A. F. M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2):107–13, 1993.
[129] M. Granovetter, Threshold models of collective behavior. American Journal of Sociology, 83(6):1420–43, May 1978.
[130] A. Grosfeld-Nir, Control limits for two-state partially observable Markov decision processes. European Journal of Operational Research, 182(1):300–4, 2007.
[131] D. Guo, S. Shamai and S. Verdú, Mutual information and minimum mean-square error in Gaussian channels. IEEE Transactions on Information Theory, 51(4):1261–82, 2005.
[132] M. Hamdi, G. Solman, A. Kingstone and V. Krishnamurthy, Social learning in a human society: An experimental study. arXiv preprint arXiv:1408.5378, 2014.
[133] J. D. Hamilton and R. Susmel, Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64(2):307–33, 1994.
[134] J. E. Handschin and D. Q. Mayne, Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. International Journal of Control, 9(5):547–59, 1969.
[135] E. J. Hannan and M. Deistler, The Statistical Theory of Linear Systems. Wiley Series in Probability and Mathematical Statistics. (New York, NY: John Wiley, 1988).
[136] T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning. (Springer-Verlag, 2009).
[137] M. Hauskrecht, Value-function approximations for partially observable Markov decision processes. Journal of Artificial Intelligence Research, 13(1):33–94, 2000.
[138] S. Haykin, Cognitive radio: Brain-empowered wireless communications. IEEE Journal on Selected Areas in Communications, 23(2):201–20, Feb. 2005.
[139] S. Haykin, Adaptive Filter Theory, 5th edition. Information and System Sciences Series. (Prentice Hall, 2013).
[140] D. D. Heckathorn, Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems, 44:174–99, 1997.
[141] D. D. Heckathorn, Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Social Problems, 49:11–34, 2002.
[142] M. E. Hellman and T. M. Cover, Learning with finite memory. The Annals of Mathematical Statistics, 41(3):765–82, 1970.
[143] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria. (New York, NY: Springer-Verlag, 1996).
[144] D. P. Heyman and M. J. Sobel, Stochastic Models in Operations Research, volume 2. (McGraw-Hill, 1984).
[145] N. Higham and L. Lin, On pth roots of stochastic matrices. Linear Algebra and Its Applications, 435(3):448–63, 2011.
[146] Y.-C. Ho and X.-R. Cao, Discrete Event Dynamic Systems and Perturbation Analysis. (Boston, MA: Kluwer Academic, 1991).
[147] J. Hofbauer and W. Sandholm, On the global convergence of stochastic fictitious play. Econometrica, 70(6):2265–94, November 2002.
[148] R. A. Horn and C. R. Johnson, Matrix Analysis. (Cambridge University Press, 2012).
[149] R. A. Howard, Dynamic Probabilistic Systems, volume 1: Markov Models. (New York: John Wiley, 1971).
[150] R. A. Howard, Dynamic Probabilistic Systems, volume 2: Semi-Markov and Decision Processes. (New York: John Wiley, 1971).
[151] D. Hsu, S. Kakade and T. Zhang, A spectral algorithm for learning hidden Markov models. Journal of Computer and System Sciences, 78(5):1460–80, 2012.
[152] S.-P. Hsu, D.-M. Chuang and A. Arapostathis, On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion. Systems & Control Letters, 55(2):165–73, 2006.
[153] M. Huang and S. Dey, Stability of Kalman filtering with Markovian packet losses. Automatica, 43(4):598–607, 2007.
[154] I. Arasaratnam and S. Haykin, Cubature Kalman filters. IEEE Transactions on Automatic Control, 54(6):1254–69, 2009.
[155] K. Iida, Studies on the Optimal Search Plan, volume 70 of Lecture Notes in Statistics. (Springer-Verlag, 1990).
[156] M. O. Jackson, Social and Economic Networks. (Princeton, NJ: Princeton University Press, 2010).
[157] M. R. James, V. Krishnamurthy and F. LeGland, Time discretization of continuous-time filters and smoothers for HMM parameter estimation. IEEE Transactions on Information Theory, 42(2):593–605, March 1996.
[158] M. R. James, J. S. Baras and R. J. Elliott, Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems. IEEE Transactions on Automatic Control, 39(4):780–92, April 1994.
[159] B. Jamison, Reciprocal processes. Probability Theory and Related Fields, 30(1):65–86, 1974.
[160] A. H. Jazwinski, Stochastic Processes and Filtering Theory. (NJ: Academic Press, 1970).
[161] A. Jobert and L. C. G. Rogers, Valuations and dynamic convex risk measures. Mathematical Finance, 18(1):1–22, 2008.
[162] L. Johnston and V. Krishnamurthy, Opportunistic file transfer over a fading channel – a POMDP search theory formulation with optimal threshold policies. IEEE Transactions on Wireless Communications, 5(2):394–405, Feb. 2006.
[163] T. Kailath, Linear Systems. (NJ: Prentice Hall, 1980).
[164] R. E. Kalman, A new approach to linear filtering and prediction problems. Trans. ASME, Series D (J. Basic Engineering), 82:35–45, March 1960.
[165] R. E. Kalman, When is a linear control system optimal? J. Basic Engineering, 51–60, April 1964.
[166] R. E. Kalman and R. S. Bucy, New results in linear filtering and prediction theory. Trans. ASME, Series D (J. Basic Engineering), 83:95–108, March 1961.
[167] I. Karatzas and S. Shreve, Brownian Motion and Stochastic Calculus, 2nd edition. (Springer, 1991).
[168] S. Karlin, Total Positivity, volume 1. (Stanford University Press, 1968).
[169] S. Karlin and Y. Rinott, Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions. Journal of Multivariate Analysis, 10(4):467–98, December 1980.
[170] S. Karlin and H. M. Taylor, A Second Course in Stochastic Processes. (Academic Press, 1981).
[171] K. V. Katsikopoulos and S. E. Engelbrecht, Markov decision processes with delays and asynchronous cost collection. IEEE Transactions on Automatic Control, 48(4):568–74, 2003.
[172] J. Keilson and A. Kester, Monotone matrices and monotone Markov processes. Stochastic Processes and Their Applications, 5(3):231–41, 1977.
[173] H. K. Khalil, Nonlinear Systems, 3rd edition. (Prentice Hall, 2002).
[174] M. Kijima, Markov Processes for Stochastic Modelling. (Chapman and Hall, 1997).
[175] A. N. Kolmogorov, Interpolation and extrapolation of stationary random sequences. Bull. Acad. Sci. U.S.S.R., Ser. Math., 5:3–14, 1941.
[176] A. N. Kolmogorov, Stationary sequences in Hilbert space. Bull. Math. Univ. Moscow, 2(6), 1941.
[177] L. Kontorovich and K. Ramanan, Concentration inequalities for dependent random variables via the martingale method. The Annals of Probability, 36(6):2126–58, 2008.
[178] V. Krishnamurthy, Algorithms for optimal scheduling and management of hidden Markov model sensors. IEEE Transactions on Signal Processing, 50(6):1382–97, June 2002.
[179] V. Krishnamurthy, Bayesian sequential detection with phase-distributed change time and nonlinear penalty – A lattice programming POMDP approach. IEEE Transactions on Information Theory, 57(10):7096–124, October 2011.
[180] V. Krishnamurthy, How to schedule measurements of a noisy Markov chain in decision making? IEEE Transactions on Information Theory, 59(7):4440–61, July 2013.
[181] V. Krishnamurthy and F. Vázquez-Abad, Gradient based policy optimization of constrained unichain Markov decision processes. In S. Cohen, D. Madan and T. Siu, editors, Stochastic Processes, Finance and Control: A Festschrift in Honor of Robert J. Elliott. (World Scientific, 2012). http://arxiv.org/abs/1110.4946.
[182] V. Krishnamurthy, R. Bitmead, M. Gevers and E. Miehling, Sequential detection with mutual information stopping cost: Application in GMTI radar. IEEE Transactions on Signal Processing, 60(2):700–14, 2012.
[183] V. Krishnamurthy and D. Djonin, Structured threshold policies for dynamic sensor scheduling: A partially observed Markov decision process approach. IEEE Transactions on Signal Processing, 55(10):4938–57, Oct. 2007.
[184] V. Krishnamurthy and D. V. Djonin, Optimal threshold policies for multivariate POMDPs in radar resource management. IEEE Transactions on Signal Processing, 57(10), 2009.
[185] V. Krishnamurthy, O. Namvar Gharehshiran and M. Hamdi, Interactive sensing and decision making in social networks. Foundations and Trends in Signal Processing, 7(1–2):1–196, 2014.
[186] V. Krishnamurthy and W. Hoiles, Online reputation and polling systems: Data incest, social learning and revealed preferences. IEEE Transactions on Computational Social Systems, 1(3):164–79, January 2015.
[187] V. Krishnamurthy and U. Pareek, Myopic bounds for optimal policy of POMDPs: An extension of Lovejoy's structural results. Operations Research, 62(2):428–34, 2015.
[188] V. Krishnamurthy and H. V. Poor, Social learning and Bayesian games in multiagent signal processing: How do local and global decision makers interact? IEEE Signal Processing Magazine, 30(3):43–57, 2013.
[189] V. Krishnamurthy and C. Rojas, Reduced complexity HMM filtering with stochastic dominance bounds: A convex optimization approach. IEEE Transactions on Signal Processing, 62(23):6309–22, 2014.
[190] V. Krishnamurthy, C. Rojas and B. Wahlberg, Computing monotone policies for Markov decision processes by exploiting sparsity. In 3rd Australian Control Conference (AUCC), pp. 1–6. IEEE, 2013.
[191] V. Krishnamurthy and B. Wahlberg, POMDP multiarmed bandits – structural results. Mathematics of Operations Research, 34(2):287–302, May 2009.
[192] V. Krishnamurthy and G. Yin, Recursive algorithms for estimation of hidden Markov models and autoregressive models with Markov regime. IEEE Transactions on Information Theory, 48(2):458–76, February 2002.
[193] P. R. Kumar and P. Varaiya, Stochastic Systems – Estimation, Identification and Adaptive Control. (Prentice Hall, 1986).
[194] H. Kurniawati, D. Hsu and W. S. Lee, SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Robotics: Science and Systems Conference, Zurich, Switzerland, 2008.
[195] T. G. Kurtz, Approximation of Population Processes, volume 36. (SIAM, 1981).
[196] H. J. Kushner, Dynamical equations for optimal nonlinear filtering. Journal of Differential Equations, 3:179–90, 1967.
[197] H. J. Kushner, A robust discrete state approximation to the optimal nonlinear filter for a diffusion. Stochastics, 3(2):75–83, 1979.
[198] H. J. Kushner, Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory. (Cambridge, MA: MIT Press, 1984).
[199] H. J. Kushner and D. S. Clark, Stochastic Approximation Methods for Constrained and Unconstrained Systems. (Springer-Verlag, 1978).
[200] H. J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd edition. (Springer-Verlag, 2003).
[201] T. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1):4–22, 1985.
[202] A. Lansky, A. Abdul-Quader, M. Cribbin, T. Hall, T. J. Finlayson, R. Garfein, L. S. Lin and P. Sullivan, Developing an HIV behavioral surveillance system for injecting drug users: The National HIV Behavioral Surveillance System. Public Health Reports, 122(S1):48–55, 2007.
[203] S. Lee, Understanding respondent driven sampling from a total survey error perspective. Survey Practice, 2(6), 2009.
[204] F. LeGland and L. Mevel, Exponential forgetting and geometric ergodicity in hidden Markov models. Mathematics of Control, Signals and Systems, 13(1):63–93, 2000.
[205] B. G. Leroux, Maximum-likelihood estimation for hidden Markov models. Stochastic Processes and Their Applications, 40:127–43, 1992.
[206] R. Levine and G. Casella, Implementations of the Monte Carlo EM algorithm. Journal of Computational and Graphical Statistics, 10(3):422–39, September 2001.
[207] T. Lindvall, Lectures on the Coupling Method. (Courier Dover Publications, 2002).
[208] M. Littman, A. R. Cassandra and L. Kaelbling, Learning policies for partially observable environments: Scaling up. In ICML, volume 95, pp. 362–70, 1995.
[209] M. L. Littman, Algorithms for Sequential Decision Making. PhD thesis, Brown University, 1996.
[210] M. L. Littman, A tutorial on partially observable Markov decision processes. Journal of Mathematical Psychology, 53(3):119–25, 2009.
[211] C. Liu and D. B. Rubin, The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika, 81(4):633–48, 1994.
[212] J. S. Liu, Monte Carlo Strategies in Scientific Computing. (Springer-Verlag, 2001).
[213] J. S. Liu and R. Chen, Sequential Monte Carlo methods for dynamic systems. Journal of the American Statistical Association, 93:1032–44, 1998.
[214] K. Liu and Q. Zhao, Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access. IEEE Transactions on Information Theory, 56(11):5547–67, 2010.
[215] Z. Liu and L. Vandenberghe, Interior-point method for nuclear norm approximation with application to system identification. SIAM Journal on Matrix Analysis and Applications, 31(3):1235–56, 2009.
[216] L. Ljung, Analysis of recursive stochastic algorithms. IEEE Transactions on Automatic Control, AC-22(4):551–75, 1977.
[217] L. Ljung, System Identification, 2nd edition. (Prentice Hall, 1999).
[218] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification. (Cambridge, MA: MIT Press, 1983).
[219] I. Lobel, D. Acemoglu, M. Dahleh and A. E. Ozdaglar, Preliminary results on social learning with partial observations. In Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools. (Nantes, France, 2007). ACM.
[220] A. Logothetis and A. Isaksson, On sensor scheduling via information theoretic criteria. In Proc. American Control Conf., pp. 2402–06 (San Diego, 1999).
[221] D. López-Pintado, Diffusion in complex social networks. Games and Economic Behavior, 62(2):573–90, 2008.
[222] T. A. Louis, Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, 44(B):226–33, 1982.
[223] W. S. Lovejoy, On the convexity of policy regions in partially observed systems. Operations Research, 35(4):619–21, July–August 1987.
[224] W. S. Lovejoy, Ordered solutions for dynamic programs. Mathematics of Operations Research, 12(2):269–76, 1987.
[225] W. S. Lovejoy, Some monotonicity results for partially observed Markov decision processes. Operations Research, 35(5):736–43, September–October 1987.
[226] W. S. Lovejoy, Computationally feasible bounds for partially observed Markov decision processes. Operations Research, 39(1):162–75, January–February 1991.
[227] W. S. Lovejoy, A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28:47–66, 1991.
[228] M. Luca, Reviews, Reputation, and Revenue: The Case of Yelp.com. Technical Report 12-016, Harvard Business School, September 2011.
[229] D. G. Luenberger, Optimization by Vector Space Methods. (New York, NY: John Wiley, 1969).
[230] I. MacPhee and B. Jordan, Optimal search for a moving target. Probability in the Engineering and Informational Sciences, 9:159–82, 1995.
[231] C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. (Cambridge, MA: The MIT Press, 1999).
[232] S. I. Marcus, Algebraic and geometric methods in nonlinear filtering. SIAM Journal on Control and Optimization, 22(6):817–44, November 1984.
[233] S. I. Marcus and A. S. Willsky, Algebraic structure and finite dimensional nonlinear estimation. SIAM J. Math. Anal., 9(2):312–27, April 1978.
[234] H. Markowitz, Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.
[235] D. Q. Mayne, J. B. Rawlings, C. V. Rao and P. Scokaert, Constrained model predictive control: Stability and optimality. Automatica, 36(6):789–814, 2000.
[236] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions. Wiley Series in Probability and Statistics. (New York, NY: John Wiley, 1996).
[237] L. Meier, J. Peschon and R. M. Dressler, Optimal control of measurement subsystems. IEEE Transactions on Automatic Control, 12(5):528–36, October 1967.
[238] J. M. Mendel, Maximum-Likelihood Deconvolution: A Journey into Model-Based Signal Processing. (Springer-Verlag, 1990).
[239] X. L. Meng, On the rate of convergence of the ECM algorithm. The Annals of Statistics, 22(1):326–39, 1994.
[240] S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability. (Cambridge University Press, 2009).
[241] P. Milgrom, Good news and bad news: Representation theorems and applications. Bell Journal of Economics, 12(2):380–91, 1981.
[242] P. Milgrom and C. Shannon, Monotone comparative statics. Econometrica, 62(1):157–80, 1994.
[243] R. R. Mohler and C. S. Hwang, Nonlinear data observability and information. Journal of the Franklin Institute, 325(4):443–64, 1988.
[244] G. E. Monahan, A survey of partially observable Markov decision processes: Theory, models and algorithms. Management Science, 28(1), January 1982.
[245] P. Del Moral, Feynman-Kac Formulae – Genealogical and Interacting Particle Systems with Applications. (Springer-Verlag, 2004).
[246] W. Moran, S. Suvorova and S. Howard, Application of sensor scheduling concepts to radar. In A. Hero, D. Castanon, D. Cochran and K. Kastella, editors, Foundations and Applications of Sensor Management, pp. 221–56. (Springer-Verlag, 2006).
[247] G. B. Moustakides, Optimal stopping times for detecting changes in distributions. Annals of Statistics, 14:1379–87, 1986.
[248] A. Muller, How does the value function of a Markov decision process depend on the transition probabilities? Mathematics of Operations Research, 22:872–85, 1997.
[249] A. Muller and D. Stoyan, Comparison Methods for Stochastic Models and Risks. (Wiley, 2002).
[250] M. F. Neuts, Structured Stochastic Matrices of M/G/1 Type and Their Applications. (Marcel Dekker, 1989).
[251] A. Ng and M. Jordan, PEGASUS: A policy search method for large MDPs and POMDPs. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, pp. 406–15. (Morgan Kaufmann, 2000).
[252] M. H. Ngo and V. Krishnamurthy, Optimality of threshold policies for transmission scheduling in correlated fading channels. IEEE Transactions on Communications, 57(8):2474–83, 2009.
[253] M. H. Ngo and V. Krishnamurthy, Monotonicity of constrained optimal transmission policies in correlated fading channels with ARQ. IEEE Transactions on Signal Processing, 58(1):438–51, 2010.
[254] N. Noels, C. Herzet, A. Dejonghe, V. Lottici, H. Steendam, M. Moeneclaey, M. Luise and L. Vandendorpe, Turbo synchronization: An EM algorithm interpretation. In Proceedings of IEEE International Conference on Communications (ICC '03), volume 4, pp. 2933–7. IEEE, 2003.
[255] M. Ottaviani and P. Sørensen, Information aggregation in debate: Who should speak first? Journal of Public Economics, 81(3):393–421, 2001.
[256] C. H. Papadimitriou and J. N. Tsitsiklis, The complexity of Markov decision processes. Mathematics of Operations Research, 12(3):441–50, 1987.
[257] E. Pardoux, Équations du filtrage non linéaire, de la prédiction et du lissage. Stochastics, 6:193–231, 1982.
[258] R. Parr and S. Russell, Approximating optimal policies for partially observable stochastic domains. In IJCAI, volume 95, pp. 1088–94, 1995.
[259] R. Pastor-Satorras and A. Vespignani, Epidemic spreading in scale-free networks. Physical Review Letters, 86(14):3200, 2001.
[260] S. Patek, On partially observed stochastic shortest path problems. In Proceedings of 40th IEEE Conference on Decision and Control, pp. 5050–5, Orlando, Florida, 2001.
[261] G. Pflug, Optimization of Stochastic Models: The Interface between Simulation and Optimization. (Kluwer Academic, 1996).
[262] J. Pineau, G. Gordon and S. Thrun, Point-based value iteration: An anytime algorithm for POMDPs. In IJCAI, volume 3, pp. 1025–32, 2003.
[263] M. L. Pinedo, Scheduling: Theory, Algorithms, and Systems. (Springer-Verlag, 2012).
[264] L. K. Platzman, Optimal infinite-horizon undiscounted control of finite probabilistic systems. SIAM Journal on Control and Optimization, 18:362–80, 1980.
[265] S. M. Pollock, A simple model of search for a moving target. Operations Research, 18:893–903, 1970.
[266] B. T. Polyak and A. B. Juditsky, Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838–55, July 1992.
[267] H. V. Poor, Quickest detection with exponential penalty for delay. Annals of Statistics, 26(6):2179–205, 1998.
[268] H. V. Poor and O. Hadjiliadis, Quickest Detection. (Cambridge University Press, 2008).
[269] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd edition. (Springer-Verlag, 1993).
[270] B. M. Pötscher and I. R. Prucha, Dynamic Nonlinear Econometric Models: Asymptotic Theory. (Springer-Verlag, 1997).
[271] K. Premkumar, A. Kumar and V. V. Veeravalli, Bayesian quickest transient change detection. In Proceedings of International Workshop in Applied Probability, Madrid, 2010.
[272] M. Puterman, Markov Decision Processes. (John Wiley, 1994).
[273] J. Quah and B. Strulovici, Aggregating the single crossing property. Econometrica, 80(5):2333–48, 2012.
[274] L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–85, 1989.
[275] V. Raghavan and V. Veeravalli, Bayesian quickest change process detection. In ISIT, pp. 644–8, Seoul, 2009.
[276] F. Riedel, Dynamic coherent risk measures. Stochastic Processes and Their Applications, 112(2):185–200, 2004.
[277] U. Rieder, Structural results for partially observed control models. Methods and Models of Operations Research, 35(6):473–90, 1991.
[278] U. Rieder and R. Zagst, Monotonicity and bounds for convex stochastic control models. Mathematical Methods of Operations Research, 39(2):187–207, June 1994.
[279] B. Ristic, S. Arulampalam and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications. (Artech House, 2004).
[280] C. P. Robert and G. Casella, Monte Carlo Statistical Methods. (Springer-Verlag, 2013).
[281] R. T. Rockafellar and S. Uryasev, Optimization of conditional value-at-risk. Journal of Risk, 2:21–42, 2000.
[282] S. Ross, Arbitrary state Markovian decision processes. The Annals of Mathematical Statistics, 2118–22, 1968.
[283] S. Ross, Introduction to Stochastic Dynamic Programming. (San Diego, CA: Academic Press, 1983).
[284] S. Ross, Simulation, 5th edition. (Academic Press, 2013).
[285] D. Rothschild and J. Wolfers, Forecasting elections: Voter intentions versus expectations, 2010.
[286] N. Roy, G. Gordon and S. Thrun, Finding approximate POMDP solutions through belief compression. Journal of Artificial Intelligence Research, 23:1–40, 2005.
[287] W. Rudin, Principles of Mathematical Analysis. (McGraw-Hill, 1976).
[288] A. Ruszczyński, Risk-averse dynamic programming for Markov decision processes. Mathematical Programming, 125(2):235–61, 2010.
[289] T. Sakaki, M. Okazaki and Y. Matsuo, Earthquake shakes Twitter users: Real-time event detection by social sensors. In Proceedings of the 19th International Conference on World Wide Web, pp. 851–60. (New York, 2010). ACM.
[290] A. Sayed, Adaptive Filters. (Wiley, 2008).
[291] A. H. Sayed, Adaptation, learning, and optimization over networks. Foundations and Trends in Machine Learning, 7(4–5):311–801, 2014.
[292] M. Segal and E. Weinstein, A new method for evaluating the log-likelihood gradient, the Hessian, and the Fisher information matrix for linear dynamic systems. IEEE Transactions on Information Theory, 35(3):682–7, May 1989.
[293] E. Seneta, Non-Negative Matrices and Markov Chains. (Springer-Verlag, 1981).
[294] L. I. Sennott, Stochastic Dynamic Programming and the Control of Queueing Systems. (Wiley, 1999).
[295] M. Shaked and J. G. Shanthikumar, Stochastic Orders. (Springer-Verlag, 2007).
[296] G. Shani, R. Brafman and S. Shimony, Forward search value iteration for POMDPs. In IJCAI, pp. 2619–24, 2007.
[297] G. Shani, J. Pineau and R. Kaplow, A survey of point-based POMDP solvers. Autonomous Agents and Multi-Agent Systems, 27(1):1–51, 2013.
[298] A. N. Shiryaev, On optimum methods in quickest detection problems. Theory of Probability and Its Applications, 8(1):22–46, 1963.
[299] R. H. Shumway and D. S. Stoffer, An approach to time series smoothing and forecasting using the EM algorithm. Journal of Time Series Analysis, 253–64, 1982.
[300] R. Simmons and S. Koenig, Probabilistic navigation in partially observable environments. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 1080–7 (Montreal: Morgan Kaufmann, 1995).
[301] S. Singh and V. Krishnamurthy, The optimal search for a Markovian target when the search path is constrained: The infinite horizon case. IEEE Transactions on Automatic Control, 48(3):487–92, March 2003.
[302] R. D. Smallwood and E. J. Sondik, Optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21:1071–88, 1973.
[303] J. E. Smith and K. F. McCardle, Structural properties of stochastic dynamic programs. Operations Research, 50(5):796–809, 2002.
[304] T. Smith and R. Simmons, Heuristic search value iteration for POMDPs. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pp. 520–7. (AUAI Press, 2004).
[305] V. Solo and X. Kong, Adaptive Signal Processing Algorithms – Stability and Performance. (NJ: Prentice Hall, 1995).
[306] E. J. Sondik, The Optimal Control of Partially Observable Markov Processes. PhD thesis, Electrical Engineering, Stanford University, 1971.
[307] E. J. Sondik, The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Operations Research, 26(2):282–304, March–April 1978.
[308] M. Spaan and N. Vlassis, Perseus: Randomized point-based value iteration for POMDPs. J. Artif. Intell. Res. (JAIR), 24:195–220, 2005.
[309] J. Spall, Introduction to Stochastic Search and Optimization. (Wiley, 2003).
[310] L. Stone, What's happened in search theory since the 1975 Lanchester Prize. Operations Research, 37(3):501–06, May–June 1989.
[311] R. L. Stratonovich, Conditional Markov processes. Theory of Probability and Its Applications, 5(2):156–78, 1960.
[312] J. Surowiecki, The Wisdom of Crowds. (New York, NY: Anchor, 2005).
[313] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. (Cambridge, MA: MIT Press, 1998).
[314] T. Moon and T. Weissman, Universal filtering via hidden Markov modeling. IEEE Transactions on Information Theory, 54(2):692–708, 2008.
[315] M. A. Tanner, Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions. Springer Series in Statistics. (New York, NY: Springer-Verlag, 1993).
[316] M. A. Tanner and W. H. Wong, The calculation of posterior distributions by data augmentation. J. Am. Statis. Assoc., 82:528–40, 1987.
[317] A. G. Tartakovsky and V. V. Veeravalli, General asymptotic Bayesian theory of quickest change detection. Theory of Probability and Its Applications, 49(3):458–97, 2005.
[318] R. Tibshirani, Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–88, 1996.
[319] P. Tichavsky, C. H. Muravchik and A. Nehorai, Posterior Cramér-Rao bounds for discrete-time nonlinear filtering. IEEE Transactions on Signal Processing, 46(5):1386–96, May 1998.
[320] L. Tierney, Markov chains for exploring posterior distributions. The Annals of Statistics, 22(4):1701–28, 1994.
[321] D. M. Topkis, Minimizing a submodular function on a lattice. Operations Research, 26:305–21, 1978.
[322] D. M. Topkis, Supermodularity and Complementarity. (Princeton, NJ: Princeton University Press, 1998).
[323] D. van Dyk and X. Meng, The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1):1–50, 2001.
[324] L. Vandenberghe and S. Boyd, Semidefinite programming. SIAM Review, 38(1):49–95, 1996.
[325] V. N. Vapnik, Statistical Learning Theory. (Wiley, 1998).
[326] H. Varian, The nonparametric approach to demand analysis. Econometrica, 50(4):945–73, 1982.
[327] H. Varian, Non-parametric tests of consumer behaviour. The Review of Economic Studies, 50(1):99–110, 1983.
[328] H. Varian, Revealed preference and its applications. The Economic Journal, 122(560):332–8, 2012.
[329] F. Vega-Redondo, Complex Social Networks, volume 44. (Cambridge University Press, 2007).
[330] S. Verdu, Multiuser Detection. (Cambridge University Press, 1998).
[331] B. Wahlberg, S. Boyd, M. Annergren and Y. Wang, An ADMM algorithm for a class of total variation regularized estimation problems. In Proceedings of the 16th IFAC Symposium on System Identification, July 2012.
[332] A. Wald, Note on the consistency of the maximum likelihood estimate. The Annals of Mathematical Statistics, 20(4):595–601, 1949.
[333] E. Wan and R. van der Merwe, The unscented Kalman filter for nonlinear estimation. In Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC), pp. 153–8. IEEE, 2000.
[334] C. C. White and D. P. Harrington, Application of Jensen's inequality to adaptive suboptimal design. Journal of Optimization Theory and Applications, 32(1):89–99, 1980.
[335] L. B. White and H. X. Vu, Maximum likelihood sequence estimation for hidden reciprocal processes. IEEE Transactions on Automatic Control, 58(10):2670–74, 2013.
[336] W. Whitt, Multivariate monotone likelihood ratio and uniform conditional stochastic order. Journal of Applied Probability, 19:695–701, 1982.
[337] P. Whittle, Multi-armed bandits and the Gittins index. J. R. Statist. Soc. B, 42(2):143–9, 1980.
[338] N. Wiener, The Extrapolation, Interpolation and Smoothing of Stationary Time Series. (New York, NY: John Wiley, 1949).
[339] J. Williams, J. Fisher and A. Willsky, Approximate dynamic programming for communication-constrained sensor network management. IEEE Transactions on Signal Processing, 55(8):4300–11, 2007.
[340] E. Wong and B. Hajek, Stochastic Processes in Engineering Systems, 2nd edition. (Berlin: Springer-Verlag, 1985).
[341] W. M. Wonham, Some applications of stochastic differential equations to optimal nonlinear filtering. SIAM J. Control, 2(3):347–69, 1965.
[342] C. F. J. Wu, On the convergence properties of the EM algorithm. Annals of Statistics, 11(1):95–103, 1983.
[343] J. Xie, S. Sreenivasan, G. Kornis, W. Zhang, C. Lim and B. Szymanski, Social consensus through the influence of committed minorities. Physical Review E, 84(1):011130, 2011.
[344] B. Yakir, A. M. Krieger and M. Pollak, Detecting a change in regression: First-order optimality. Annals of Statistics, 27(6):1896–1913, 1999.
[345] D. Yao and P. Glasserman, Monotone Structure in Discrete-Event Systems, 1st edition. (Wiley, 1994).
[346] G. Yin, C. Ion and V. Krishnamurthy, How does a stochastic optimization/approximation algorithm adapt to a randomly evolving optimum/root with jump Markov sample paths. Mathematical Programming, 120(1):67–99, 2009.
[347] G. Yin and V. Krishnamurthy, LMS algorithms for tracking slow Markov chains with applications to hidden Markov estimation and adaptive multiuser detection. IEEE Transactions on Information Theory, 51(7), July 2005.
[348] G. Yin, V. Krishnamurthy and C. Ion, Regime switching stochastic approximation algorithms with application to adaptive discrete stochastic optimization. SIAM Journal on Optimization, 14(4):1187–1215, 2004.
[349] G. Yin and Q. Zhang, Discrete-Time Markov Chains: Two-Time-Scale Methods and Applications, volume 55. (Springer, 2006).
[350] S. Young, M. Gasic, B. Thomson and J. Williams, POMDP-based statistical spoken dialog systems: A review. Proceedings of the IEEE, 101(5):1160–79, 2013.
[351] F. Yu and V. Krishnamurthy, Optimal joint session admission control in integrated WLAN and CDMA cellular network. IEEE Transactions on Mobile Computing, 6(1):126–39, January 2007.
[352] M. Zakai, On the optimal filtering of diffusion processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 11:230–43, 1969.
[353] Q. Zhao, L. Tong, A. Swami and Y. Chen, Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework. IEEE Journal on Selected Areas in Communications, pp. 589–600, 2007.
[354] K. Zhou, J. Doyle and K. Glover, Robust and Optimal Control, volume 40. (NJ: Prentice Hall, 1996).
