Abstract: Long-form video understanding is complicated by the high redundancy of video data and the abundance of query-irrelevant information. To tackle these challenges, we propose VideoTree, a ...
Abstract: As a very common type of video, face videos often appear in movies, talk shows, live broadcasts, and other scenes. Real-World online videos are often plagued by degradations such as blurring ...