mapping out the visual language of film using a multimodal llm:
i fed frames of a short film to a vision-language model and had it rate each moment on two things: how surreal the frame is, and how present the human figure is. i then mapped those ratings along the film's timeline. the result is an interactive playback interface based on these two dimensions.
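
a rough sketch of how the rating-and-mapping step could look in python. this is not the actual code behind the project: `rate_frame` is a hypothetical stand-in for whatever call sends a frame to the vision-language model, and the 0..1 rating scale, the fixed sampling interval, and the nearest-point lookup are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, Iterable, List, Tuple

# (surrealism, human_presence); the 0..1 scale is an assumption
Ratings = Tuple[float, float]


@dataclass
class Moment:
    time_s: float          # position of the frame in the film, in seconds
    surrealism: float
    human_presence: float


def build_timeline(
    frames: Iterable[bytes],
    seconds_per_frame: float,
    rate_frame: Callable[[bytes], Ratings],
) -> List[Moment]:
    """rate every sampled frame and keep the scores with their timestamps.

    rate_frame is a hypothetical stand-in for the vision-language model call.
    """
    timeline = []
    for i, frame in enumerate(frames):
        surrealism, human_presence = rate_frame(frame)
        timeline.append(Moment(i * seconds_per_frame, surrealism, human_presence))
    return timeline


def nearest_moment(
    timeline: List[Moment], surrealism: float, human_presence: float
) -> Moment:
    """find the moment whose ratings sit closest to a point in the 2d space."""
    return min(
        timeline,
        key=lambda m: hypot(m.surrealism - surrealism,
                            m.human_presence - human_presence),
    )
```

with a timeline like this, a player could seek to `nearest_moment(timeline, x, y).time_s` whenever the viewer picks a point on the surrealism / human-presence plane, which is one plausible way to drive playback from the two dimensions.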