The only problem I see with your idea is that it requires the lightweight node to download all transactions, even if it doesn't store them.
That was how Bitcoin's first lightweight node, bitcoinj, used to behave. It became a problem for mobile nodes as the amount of data to download kept growing. The solution was to use bloom filters, which let lightweight nodes download only the data that concerns them. To protect your privacy, you can set a high false-positive rate on the bloom filters and request different subsets of your keys from different peers. Unfortunately, for this solution to apply to Monero, full nodes would have to be able to serve bloom filter requests.
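To make the two privacy knobs concrete, here is a rough sketch of a bloom filter with a tunable false-positive rate, plus a helper that spreads your keys over several peers. This is a toy, not bitcoinj's or any real wire format; `BloomFilter`, `split_keys` and `build_filters` are names I made up for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Toy bloom filter (illustrative only, not the BIP 37 wire format)."""

    def __init__(self, n_items: int, fp_rate: float):
        # Standard sizing formulas for a target false-positive rate.
        self.m = max(8, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by hashing the item with k different prefixes.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def split_keys(keys, n_peers):
    """Round-robin the keys into one disjoint subset per peer."""
    return [keys[i::n_peers] for i in range(n_peers)]


def build_filters(keys, n_peers, fp_rate):
    """Build one filter per peer, each covering only that peer's subset."""
    filters = []
    for subset in split_keys(keys, n_peers):
        bf = BloomFilter(max(1, len(subset)), fp_rate)
        for key in subset:
            bf.add(key)
        filters.append(bf)
    return filters
```

A higher `fp_rate` makes each filter smaller and makes it match more unrelated data, which is the cover traffic you pay for in bandwidth. No single peer sees a filter covering all of your keys.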
Actually, requesting different subsets of keys would also be important when building the mixin set. I think it would be better to renew the set of keys at each transaction, so that you never reuse two keys given by the same peer for the same request. Don't keep connections open between different requests/transactions, and hide your IP behind Tor or I2P before connecting to full nodes. All these measures are meant to prevent full nodes from breaking your unlinkability.
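One way to read the rule above is "take at most one key from any single peer response, and fetch fresh responses for every transaction". A minimal sketch of that reading follows; `pick_decoys` and the shape of `responses` are hypothetical, and the network side (a new Tor/I2P connection per request) is assumed to happen elsewhere.

```python
import random


def pick_decoys(responses, n_decoys, rng=None):
    """Pick n_decoys keys, using at most one key per peer response.

    responses: dict mapping a peer id to the list of candidate keys that
    peer returned for a fresh request made for this transaction only.
    Returns a list of (peer_id, key) pairs.
    """
    rng = rng or random.SystemRandom()
    peers = [p for p, keys in responses.items() if keys]
    if len(peers) < n_decoys:
        raise ValueError("need at least one non-empty response per decoy")
    # Sample distinct peers, then one key from each chosen peer.
    return [(p, rng.choice(responses[p])) for p in rng.sample(peers, n_decoys)]
```

This sketch does not check whether two peers happened to return the same key, and it throws the responses away after one use, so nothing is carried over to the next transaction.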
Edit: I'm thinking this through again. Keeping Monero's privacy features in a lightweight node is not an easy thing. The reason I'm saying this is that I believe that, as the pool of keys available for mixing grows, the response to any request for mixin keys will look more and more like a fingerprint. The likelihood of two different requests returning identical sets of keys is probably already very small right now. And eventually, the chance of any two different requests having any intersection at all between their keys will probably be very small too.
So, suppose your lightweight node requests keys from different full nodes for a mixin. As per the assumption, there's very little chance of any intersection between the sets. So, as soon as you push a transaction that contains a key provided to you by full node F, F would know with good probability that the transaction belongs to the lightweight node it gave that key to in the past. For the same reason, F would know that the key it gave is not the true one. If F managed to collude with all the other nodes that provided the mixin keys used (a Sybil attack), the colluders would be able to infer the true key.
Perhaps adding keys returned as false positives for the bloom filter requests would help, since the true input is known to be part of that response. Actually, thinking again, you must reuse keys from the bloom filter response, because the response itself may eventually become close to a fingerprint! Damn... that makes it even worse. If the false positives from bloom filter requests are not used systematically in the mixin, the node that provided you with your output would always know when you spend it. It makes me wonder whether bloom filters are a good idea at all. OTOH, there must be a way to reduce bandwidth demands for lightweight wallets, and besides bloom filters I don't know of any other way.
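To put a rough number on the fingerprint argument: if two responses are each `set_size` keys drawn uniformly at random from a pool of `pool_size` keys (an assumption for illustration; real wallets and nodes may not sample uniformly), the chance that they share at least one key is 1 - C(N-k, k) / C(N, k). The function name below is made up.

```python
def prob_overlap(pool_size: int, set_size: int) -> float:
    """Probability that two uniformly random key sets share at least one key."""
    if 2 * set_size > pool_size:
        return 1.0  # not enough keys for two disjoint sets
    p_disjoint = 1.0
    # C(N-k, k) / C(N, k) written as a product to avoid huge integers.
    for i in range(set_size):
        p_disjoint *= (pool_size - set_size - i) / (pool_size - i)
    return 1.0 - p_disjoint
```

For small sets and a large pool this is roughly `set_size**2 / pool_size`, so the overlap probability falls as the pool grows, which is what makes a single returned key increasingly identifying.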
The use of networks like Tor or I2P becomes even more important, since the attacker would have to kind of break those networks in order to Sybil attack you. That raises the bar considerably.