Statistical Machine Learning:
Fundamental Limits and Applications of Reinforcement Learning (Single and Multi-Agent):
Reinforcement Learning is about taking suitable action to maximize reward in a particular situation. It is employed by various software and machines to find the best possible behavior or path it should take in a specific situation. Finding near-optimal algorithms for taking actions in reinforcement learning in different settings, and their applications in real world problems is the focus of our research. The different type of problems that we have considered include (i) Combinatorial bandits problem where the regret bounds are provided with low space-time complexity algorithm, and the approach has been applied to social networking applications. (ii) State in reinforcement learning is known in many applications after a delay, due to communication and other bottlenecks, and efficient algorithms are essential. (iii) Efficient algorithms for multiple competing agents to maximize their rewards are essential since most applications have competing agents. These works have multiple applications in social networks, transportation, cloud computing, video streaming. The figure alongside depicts the improved performance of proposed strategy, DeepPool, for ride-sharing. The number of customers accepted are higher for same number of vehicles used and ride-sharing improves the costs, travel times, and number of customers served.
- M. Agarwal and V. Aggarwal, "Non-Linear Reinforcement Learning: A Non-Markovian Approach," To be submitted.
- M. Agarwal and V. Aggarwal, "Blind Decision Making: Reinforcement Learning with Delayed Observations," Submitted Mar 2019.
- A. Ghosh and V. Aggarwal, "Reinforcement Learning for Mean Field Game," Submitted Feb 2019.
- A. Singh, A. Alabbasi, and V. Aggarwal, "A Reinforcement Learning Based Algorithm for Multi-hop Ride-sharing: Model-free Approach," in Proc. Neurips Workshop, Dec 2019.
- Y. Wang, Y. Li, T. Lan, and V. Aggarwal, "DeepChunk: Deep Q-Learning for Chunk-based Caching in Data Processing Networks," Accepted to IEEE Transactions on Cognitive Communications and Networking, Special Issue on Deep Reinforcement Learning for Future Wireless Communication Networks, Oct 2019.
- Ramkumar Raghu, Pratheek Upadhyaya, Mahadesh Panju, Vaneet Aggarwal, and Vinod Sharma, "Deep Reinforcement Learning Based Power control for Wireless Multicast Systems," in Proc. Allerton, Oct 2019.
- Y. Wang, Y. Li, T. Lan, and V. Aggarwal"DeepChunk: Deep Q-Learning for Chunk-based Caching in Data Processing Networks," in Proc. Allerton, Oct 2019.
- M. Agarwal and V. Aggarwal, "Regret Bounds for Stochastic Combinatorial Multi-Armed Bandits with Linear Space Complexity," Submitted Nov 2018.
- A. Al-Abassi, A. Ghosh, and V. Aggarwal, "DeepPool: Distributed Model-free Algorithm for Ride-sharing using Deep Reinforcement Learning," Accepted to IEEE Transactions on Intelligent Transportation Systems, Jul 2019.
Fundamental Limits on Subspace Clustering and Data Completion:
Tensors are generalizations of vectors and matrices; a vector is a first-order tensor and a matrix is a second-order tensor. Most of the data around us are better represented with multiple orders to capture the correlations across different attributes. For example, a color image can be considered as a third-order tensor, two of the dimensions (rows and columns) being spatial, and the third being spectral (color), while a color video sequence can be considered as an order four tensor, time being the fourth dimension besides spatial and spectral. Similarly, a colored 3-D MRI image across time can be considered as an order five tensor. Exploiting additional structure leads to better embedding algorithms for subspace analysis and the elements needed for data completion (as shown alongside).
- M. Ashraphijuo, X. Wang, and V. Aggarwal, "Deterministic and Probabilistic Conditions for Finite Completability of Low-rank Multi-View Data," Submitted, Jul 2017.
- M. Ashraphijuo, V. Aggarwal, and X. Wang, "Deterministic and Probabilistic Conditions for Finite Completability of Low-Tucker-Rank Tensor," Submitted to IEEE Transactions on Information Theory, Dec 2016.
- W. Wang, V. Aggarwal, and S. Aeron, "Tensor Train Neighborhood Preserving Embedding," IEEE Transactions on Signal Processing, vol. 66, no. 10, pp. 2724-2732, May, 2018.
- M. Ashraphijuo, V. Aggarwal, and X. Wang, "On Deterministic Sampling Patterns for Robust Low-Rank Matrix Completion," IEEE Signal Processing Letters, vol. 25. no. 3, pp. 343-347, Mar 2018.
- M. Ashraphijuo, X. Wang, and V. Aggarwal, "Rank Determination for Low-Rank Data Completion," Journal of Machine Learning Research, vol. 18(98), pp. 1-29, Sept 2017.
- M. Ashraphijuo, X. Wang, and V. Aggarwal,"An approximation of the CP-rank of a partially sampled tensor," in Proc. Allerton, Oct 2017.
- M. Ashraphijuo, V. Aggarwal, and X. Wang, "A characterization of sampling patterns for low-tucker-rank tensor completion problem," in Proc. IEEE ISIT, Jun 2017.
- M. Ashraphijuo, X. Wang, and V. Aggarwal,"A characterization of sampling patterns for low-rank multi-view data completion problem," in Proc. IEEE ISIT, Jun 2017.
- V. Aggarwal and S. Aeron, "A note on Information-theoretic Bounds on Matrix Completion under Union of Subspaces Model," in Proc. Allerton, Sept 2016.
- W.Wang, S. Aeron, and V. Aggarwal, "On deterministic conditions for subspace clustering under missing data," in Proc. IEEE ISIT, 2016.
Applications of Exploiting Data Structure in Matrix/Tensor for Better Sampling, Fingerprinting, and Estimation:
Exploiting structure in the data across multiple dimensions can lead to efficient techniques for efficient sampling and fingerprinting. The figure alongside shows different network service atributes dependence on time, spatial locations and data service quality measurements thus motivating tensor structure. Further, the map shows the received signal strength in indoor has spatial dependence and the signals from different APs are correlated and thus adaptive sampling can reduce fingerprints needed for localization.
- X. Liu, S. Aeron, V. Aggarwal, and X. Wang, "Low-tubal-rank Tensor Completion using Alternating Minimization," Submitted to IEEE Transactions on Information Theory, Oct 2016.
- W. Wang, Y. Sun, B. Eriksson, W. Wang, and V. Aggarwal, "Wide Compression: Tensor Ring Nets," in Proc. CVPR, Jun 2018
- W.Wang, V. Aggarwal, and S. Aeron, "Efficient Low Rank Tensor Ring Completion," in Proc. ICCV, Oct 2017.
- W.Wang, V. Aggarwal, and S. Aeron, "Unsupervised Clustering Under The Union of Polyhedral Cones (UOPC) Model," Pattern Recognition Letters, vol. 100, pp. 104-109, Dec 2017.
- V. Aggarwal, A. A. Mahimkar, H. Ma, Z. Zhang, S. Aeron, and W. Willinger, "Inferring Smartphone Service Quality using Tensor Methods," in Proc. 12th International Conference on Network and Service Management Oct-Nov, 2016.
- X. Liu, S. Aeron, V. Aggarwal, X. Wang, and M. Wu, "Adaptive Sampling of RF fingerprints for Fine-grained Indoor Localization," IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2411-2423, Oct. 2016.
- X. Liu, S. Aeron, V. Aggarwal, and X. Wang, "Low-tubal-rank Tensor Completion using Alternating Minimization," in Proc. SPIE Defense + Security, 2016.
- X. Liu, S. Aeron, V. Aggarwal, X. Wang, and M. Wu, "Tensor completion via adaptive sampling of tensor fibers: An application to efficient RF fingerprinting," in Proc. IEEE ICASSP, Mar 2016.
- Exploiting Network Structure for Better Scheduling Strategies:
Learning the network characteristics is crucial for coming up with next generation network designs. Getting the measurements of the received power from a tower leads to path loss which helps in design of efficient codes. Fingerprinting roads can help find efficient scheduling/handoff decisions for a service provider. Learning attributes at the network helps learn interaction between different layers to achieve better designs. The distribution of interference can lead to better interference management techniques. This understanding will be useful to achieve new and better system designs. The figure alongside depicts a mobile user trajectory along a road through 3 cellular sectors showing interactions with the cellular network and the measured values of channel quality (Ec/Io) during 3 different drives. This structure has been used to novel scheduling strategies with prediction.
- R. Margolies, A. Sridharan, V. Aggarwal, R. Jana, N. K. Shankaranayanan, V. Vaishampayan, and G. Zussman, "Exploiting Mobility in Proportional Fair Cellular Scheduling: Measurements and Algorithms," IEEE/ACM Transactions on Networking, vol. 24, no. 1, pp. 355-367, Feb. 2016.
- R. Margolies, A. Sridharan, V. Aggarwal, R. Jana, N. K. Shankaranayanan, V. Vaishampayan, and G. Zussman, "Exploiting Mobility in Proportional Fair Cellular Scheduling: Measurements and Algorithms," in Proc. Infocomm 2014.
- V. Aggarwal, R. Jana, J. Pang, K. K. Ramakrishnan and N. K. Shankaranarayanan, "Characterizing Fairness for 3G Wireless Networks," in Proc. LANMAN, Oct. 2011.
- V. Aggarwal and S. Krishnan, "Achieving Approximate Soft Clustering in Data Streams," 2011. Implementation
- Learning Structures in Data:
Video and web applications are the most prominent applications in today's world. It is important to understand how radio network characteristics (such as signal strength, handovers, load, etc.) influence users' Quality-of-Experience (QoE). Understanding the relationship between QoE and network characteristics is a pre-requisite for cellular network operators to detect when and where degraded network conditions actually impact QoE. Unfortunately, cellular network operators do not have access to detailed server-side or client-side logs to directly measure QoE metrics, such as abandonment rate, stalls, and video/session length. We devise a machine-learning-based mechanism to infer web QoE metrics from network traces accurately. The image alongside depicts partial web downloads as a function of network attributes. Further, heavy tails in work loads (file sizes, flow lengths, service times, etc.) have significant negative impact on the performance of queues and networks. There has been attempts to model work loads in head or tail of the distribution. We find that logPH distribution models both the head and the tail, since these distributions have a power law tail and can approximate any distribution arbitrarily closely not just in the tail but in its entire range. The figure below compares the logPH fit of the famous Internet file size data of Crovella.
- A. Balachandran, V. Aggarwal, E. Halepovic, J. Pang, S. Seshan, S. Venkataraman, and H. Yan "Modeling Web Quality-of-Experience on Cellular Networks," in Proc. Mobicom 2014
- V. Aggarwal, E. Halepovic, J. Pang, S. Venkataraman, and H. Yan "Prometheus: Toward Quality-of-Experience Estimation for Mobile Apps from Passive Network Measurements," in Proc. Hotmobile 2014
- V. Ramaswami, K. Jain, R. Jana, and V. Aggarwal, "Modeling Heavy-tails in Traffic Sources for Network Performance Evaluation," in Proc. International Conference on Computational Intelligence, Cyber Security and Computational Models, Dec. 2013.