I can't convey enough what a huge gulf there is between a simple experiment that looks like it works and production-grade software that scales to real games across a range of real cases. Sure, you can upload a texture every frame, but this nukes performance on some devices (especially low-end Android). It's similar to the performance hit some people run into when a text object changes every frame, which also relies on texture uploads. Animated SVGs are likely to be significantly worse, though, since they would likely be used with many more instances and at larger sizes. Synchronous texture uploads can also jank the game; asynchronous ones have complex scheduling issues and can introduce a visible delay. Simply dumping everything into a separate texture per instance will end up using a colossal amount of memory if you have lots of instances each showing something different - probably enough to crash the game on some devices.
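To put a rough number on the memory and upload concern, here is a back-of-envelope sketch. All the figures are assumptions picked for illustration (uncompressed RGBA at 4 bytes per pixel, no mipmaps, 200 instances at 512x512, 60 frames per second), and the function names are hypothetical rather than part of any engine API.

```python
def texture_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Bytes for one uncompressed texture (assumes RGBA8, no mipmaps)."""
    return width * height * bytes_per_pixel


def total_mib(instances: int, width: int, height: int) -> float:
    """Total MiB if every instance keeps its own separate texture."""
    return instances * texture_bytes(width, height) / (1024 * 1024)


def upload_mib_per_second(instances: int, width: int, height: int, fps: int) -> float:
    """MiB uploaded per second if every instance re-uploads its texture each frame."""
    return total_mib(instances, width, height) * fps


# Hypothetical scene: 200 animated instances, each rendered at 512x512.
print(total_mib(200, 512, 512))                  # 200.0 MiB of texture memory
print(upload_mib_per_second(200, 512, 512, 60))  # 12000.0 MiB/s of uploads
```

Even with these modest assumed sizes, that is 200 MiB of textures and around 12 GB of uploads per second in the worst case, which gives a sense of why the naive approach falls over on low-end devices.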
When I say it would be very difficult, it's not a question of "can it be done?" - as with most things, it's easy to whip up a basic demo that shows something working. The real question is "will it work well once scaled up to the demanding requirements of real-world games?"