Doing My Taxes with Claude

2026-03-05 15:28, theautomatedoperator.substack.com

Claude found a deduction my CPA missed; next year I'm getting a CPA who uses AI

It’s tax time! Nothing like spending hours gathering documents for my CPA so that I can get a large bill from both him and the government.

Thankfully, this year I’ve got Claude by my side, so I’m having a downright good time getting to use this as an eval for its capabilities. I’m pleased to say it has outperformed my expectations in a couple of areas. If you’ve been putting off dealing with your taxes, I hope this post gives you some insight into how LLMs can help make them less painful.

Every year, my CPA sends me a tax organizer. It’s a fillable PDF that asks for all kinds of information, most of which is irrelevant to me, so I usually just don’t bother. I just send the source docs and a note with anything I think he might need to know for my taxes.

But this year I have a tireless, genius assistant and a desire to find ways to evaluate the usefulness of new models. Since Claude has already been helping me with my bookkeeping, it’s starting off with plenty of context about my business that’s relevant to my taxes. I figured I would also give it all the tax forms I’ve gathered together for my CPA, and it’d be able to fill out the organizer no problem.

Except I immediately encountered a problem. The organizer PDF is embedded in their web application, and there’s no way to get it out. I asked if they could send me a fillable copy of it; apparently they do not have one. I considered asking for just a non-fillable version, which they clearly must have, but I figured that would defeat the purpose — this fillable version is almost certainly linked to their tax prep software, such that the values populate wherever they need to be automatically. If I just had Claude overlay text on a non-fillable version and sent that back, they’d still have to transpose the data.

Then I had an epiphany — I can use Claude in Chrome to fill it out! Except that it doesn’t have the context that Claude Code does, plus it’s very token-intensive, so having it go through all of my tax docs to pull out the relevant info and fill it in felt like it’d be an extremely heavy lift.

Thankfully, years of being a PM at very early-stage startups have made me scrappy and resourceful in solving problems like this. The solution: have Claude in Chrome run through the document, creating a JSON representation of all the fields, give that to Claude Code to complete, then give the completed version back to Claude in Chrome and have it fill out the form. Let’s see if it works!

It took something like 15-20 minutes to finish. Not exactly speedy, but I watched it work for long enough to note that it was moving much more quickly than my prior experience using Claude in Chrome, likely because I’m now on Sonnet 4.6 rather than Opus. When it was complete, it gave me the JSON in the chat UI. Success!

Or… almost. It cut off partway through page 7. I copied and pasted it into a JSON file, then I ran it back a few more times doing two pages at a time (the first few pages had a lot fewer fields than the later ones). I hit a usage limit at page 12, which was odd since the usage limit is supposed to be shared across the entirety of my account, but the couple of Claude Code instances I had working on other things continued to chug away.

I gave the partially-complete file to Claude Code to check, and it was very complimentary. I did tell it that it came from Claude in Chrome, and I wonder if it would have gushed as much if it didn’t know the source.

Anyway, since things were looking good on that end, once my limit reset I had Claude in Chrome finish grabbing all the pages. Gave that to Claude Code, asked it to fill it out and let me know if it was missing any information (I have learned to be extremely clear and repetitive with this request, or Claude will often tell me things are done when it didn’t actually have sufficient information to complete them fully). It highlighted a few things it didn’t have, like my daughter’s SSN, but otherwise it completed it.

I didn’t bother to check the JSON, since I was going to have to check it again in the web UI to ensure everything was entered correctly anyway, and if something was wrong then I could investigate whether it was an issue with Claude Code’s JSON file or Claude in Chrome’s entry of the content into the PDF.

I gave Claude in Chrome an overview of the project and the first three pages of JSON so as not to overload the context window with all of it. Told it to fill in those pages, and it was off!

It got through the first three pages, and everything looked correct. I gave it the rest, three pages at a time, and after giving it a thorough review I can say that it nailed everything!
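For anyone wanting to replicate the round trip, here is a minimal sketch of what the intermediate JSON and the page-by-page batching might look like. The field names, the per-field schema and the helper functions are all hypothetical; the actual structure Claude in Chrome produced is not shown in this post.

```python
import json

# Hypothetical shape for the extracted organizer: one entry per form field.
# The field names and page numbers are made up for illustration.
organizer = [
    {"page": 1, "field": "taxpayer_name", "type": "text", "value": None},
    {"page": 1, "field": "filing_status", "type": "choice", "value": None},
    {"page": 2, "field": "dependent_ssn", "type": "text", "value": None},
    {"page": 4, "field": "estimated_payments", "type": "number", "value": None},
]


def fill(fields, answers):
    """Return a filled copy of the fields plus the names still missing a value."""
    filled = [{**f, "value": answers.get(f["field"])} for f in fields]
    missing = [f["field"] for f in filled if f["value"] is None]
    return filled, missing


def page_batches(fields, pages_per_batch=3):
    """Group fields into batches covering at most `pages_per_batch` distinct pages."""
    pages = sorted({f["page"] for f in fields})
    for i in range(0, len(pages), pages_per_batch):
        wanted = set(pages[i:i + pages_per_batch])
        yield [f for f in fields if f["page"] in wanted]


# Hypothetical answers pulled from source documents; one value is deliberately absent.
answers = {
    "taxpayer_name": "Jane Doe",
    "filing_status": "married_joint",
    "estimated_payments": 4000,
}
filled, missing = fill(organizer, answers)
batches = list(page_batches(filled, pages_per_batch=2))

print("Still missing:", missing)
for batch in batches:
    print(json.dumps(batch, indent=2))
```

Reporting the missing fields explicitly mirrors the "tell me what you don't have" request above, and small batches keep each paste into the browser chat short.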

Overall this definitely saved me time compared to doing it myself (plus I got a better sense of current browser use capabilities and content for the ol’ Substack), but there’s definitely room for improvement, especially on the Claude in Chrome side.

The most glaring weakness is the inability to do long-horizon tasks (though I would argue that in the grand scheme of things this is not that long). That strikes me as a pretty straightforward harness problem that Anthropic just hasn’t decided to deal with yet. Claude is definitely capable of looking at this whole task and breaking it down into 2-3 page subtasks as I did, so I’d be surprised if this weren’t solved this year.

The other limitations that were annoying but seem easily solvable are the lack of output and input of files. I asked for JSON and got it in the chat UI instead of a file (possible it would have created the file if I had asked for that specifically but my guess is no). On the other end of things, I had to copy the JSON back into the chat window because for some reason the only file types you’re allowed to upload are images.

Beyond those nits, the real issue is that Claude in Chrome chats are totally isolated. The ideal workflow here only requires input from me at the front end; I tell Claude what we’re doing, Claude in Chrome handles its piece and ships the JSON to Claude Code (or the whole thing just gets unified under a single product), Claude Code fills it in and ships it back to Claude in Chrome, and Claude in Chrome fills out the form.

To be fair, this also would have been solved by my CPA having a fillable PDF instead of making me complete it in their stupid web app.

Anyway, I doubt any of this escapes Anthropic. We started with Claude only being able to access files you uploaded to the web app, then they added the ability to connect to some cloud services, and then we got Claude Code and its ability to access files on your machine. I feel pretty confident in saying we will soon have a unified Claude product that can seamlessly move between your computer and the web.

Feeling high off the thrill of having used the most powerful technology created by mankind to fill out a form in a web app, I asked myself how else it could help with my taxes. An obvious test came to mind — could Claude find any issues with my 2024 return?

I organize all of my tax docs in Box, so it was easy to download everything I had sent my CPA in 2024 along with the completed return. I dropped that into a folder and asked Claude to review for any errors and potential missed deductions given both the documents in the folder and its general knowledge of myself and my business.

It came back suspiciously quickly and said everything looked good. I asked if it had reviewed all of the source documents and compared them against the return. It apologized and said that it had not, then it started going through them. To make sure it was going to be thorough, I told it to go into plan mode and come up with a comprehensive plan to review every single document, check every number in the return and then think through all potential deductions given the information available to it.

It thought for a while and then gave me a long enough explanation of its plan that it seemed more likely to succeed this time around. I told it to proceed, and sure enough, it worked for ten minutes then came back and told me that my CPA had missed a deduction.

I acquire brands, and I am entitled to deduct 1/15 of the purchase price of each brand every year. He neglected to file for this depreciation, despite the fact that I had included a document that specifically called out the acquisitions and given him books for the business that included the goodwill to be depreciated.
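To make the 1/15 rule concrete, here is a small worked example. The purchase price is hypothetical, and the sketch assumes a full-year, straight-line schedule; real schedules generally prorate by the month of acquisition, which is ignored here.

```python
def straight_line_schedule(purchase_price, years=15):
    """Equal annual deductions that sum to the purchase price."""
    annual = purchase_price / years
    return [annual] * years


# Hypothetical purchase price, not a figure from this post.
schedule = straight_line_schedule(150_000)

print(schedule[0])    # deduction for a single year
print(sum(schedule))  # total deducted over the full period
```

Under these assumptions, each year the deduction goes unclaimed is one full slice of the schedule left on the table.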

I’m pretty confident in Claude’s ability to handle tasks like this, but I didn’t want to assume it had so easily found what would be a pretty glaring oversight on my CPA’s part. I emailed my CPA in the politest way possible (because I very much do not want to be the guy who tells an experienced professional, “I ASKED AI AND IT SAYS YOU’RE WRONG”) explaining that I’d had AI review my return and wondering if the deduction had indeed been missed.

His response: “Yes. As long as you purchased the goodwill, you can depreciate them. We will start depreciating the goodwill from 2025.”

This is a terrible response! If you are in a job where AI will almost certainly be better than you in the near future (which is to say most white collar professionals) and a client points out that AI has caught an error in your work, the correct response is to apologize profusely and offer to fix it immediately. Soon your path to continued employment is going to be acting as the friendly human face of the AI that is doing all the work behind the scenes, so you really want to nail those customer relations skills.

Before giving him a piece of my mind, I told Claude it was right (I am glad it’s not capable of annoyance, because I imagine if I were a digital intelligence talking to a human with a drastically inferior intellect, being told I’m right about something I am obviously right about would annoy me) and asked for its thoughts on the error. It was not charitable in its assessment.

After another email in which I really did my best to give him the opportunity to take accountability for the error (he did not), I sent him a testier note explaining that given his hefty fees it’s not acceptable that a $100/month AI model caught such a basic error, and I told him to amend the filing for free or cut me a check for Claude’s estimate of the cost of the error. At that point he finally apologized and said he’d amend, but by then I had already decided to fire him after this year’s taxes. If anyone knows a CPA who is using AI effectively in their business, please let me know!

With Claude having proven its worth, the obvious next step was to see if it had any advice for this year. I gave it all of the 2025 documents I had gathered for my CPA and asked it if it had any thoughts on deductions or other tax minimization strategies.

It came back with some pretty standard stuff, but there were some fun ones in there as well:

These seem sketchy as hell and let me say I am here for it. Hiring my kids but making sure I “pay for legitimate work!” Love it. I now know what it’s like to have an LLM give me a wink and a nudge. Renting my house to myself! I asked if taking video calls counted and it said that was a gray area, so in 2026 I will be inviting my investors to my home for fund updates.

It gave me one more legitimate suggestion, which was that I can still start one SEP-IRA for each of my single member LLCs with income and contribute retroactively for the 2025 tax year, with contributions up to 25% of the income from each LLC being tax deductible.

My CPA concurred this was valid but told me to confirm with my pension advisor what the max I could put in would be. As I obviously do not have a pension advisor, I asked ChatGPT to confirm the contribution limit. It actually told me the math was slightly more complicated than a straight 25% and the real number would be closer to 20%. When I asked Claude, it apologized for the oversight (better at tax prep and customer service than my CPA) and confirmed ChatGPT was right.
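The gap between 25% and roughly 20% comes from a circular definition: for a self-employed person, the contribution is a percentage of earnings after the contribution itself has been subtracted. The sketch below works through that algebra. It is a simplification, not tax advice: it assumes the input is net self-employment earnings after the usual adjustment for half of self-employment tax, and it ignores the annual dollar cap.

```python
import math


def effective_rate(plan_rate=0.25):
    """Solve c = plan_rate * (E - c) for c / E."""
    return plan_rate / (1 + plan_rate)


def max_contribution(net_earnings, plan_rate=0.25):
    """Contribution as a share of earnings measured before the contribution."""
    return net_earnings * effective_rate(plan_rate)


# Hypothetical earnings figure for illustration.
earnings = 100_000
contribution = max_contribution(earnings)

# Sanity check: the contribution is 25% of what is left after making it.
assert math.isclose(contribution, 0.25 * (earnings - contribution))
print(round(effective_rate(), 4), round(contribution, 2))
```

So the quick "25% of income" answer and the "closer to 20%" answer describe the same rule applied to two different bases.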

I went to my brokerage account, started the process to open a SEP-IRA and got a couple of forms that had a lot of fields that seemed entirely unrelated to my setup. I consulted my tax advisor (the good one, not the human), who explained and offered to fill them out.

I reminded it that it has my entire return from 2024, which definitely has the answers it asked for, and it got to filling. While it was doing that, I printed off the signature page for each form, signed and scanned it. When Claude was done, I asked it to put them in the PDFs. SEP-IRA application forms complete!

Now that Claude has helped me pay less in taxes, I thought I’d see if it could also help me reduce my CPA’s bill. I know there are a bunch of forms in my return that require zero actual tax expertise; they’re things like form 8949, on which you report stock trades. Filling these out is just a matter of copying text from the forms provided by my brokerage account. Why should I pay for one of his associates to do this when I’ve got AI?

Luckily it was easy to verify whether Claude can handle this, since I have both the inputs and outputs from 2024’s taxes. I had it take the forms that reported my stock trades, get a copy of the relevant IRS form and fill it out. It got the information correct on the first try but didn’t match my CPA’s styling, so I screenshotted one of the trades from my CPA’s 2024 return and asked it to match the font as well as the particular structure of the text.

Next try: perfect. If you’re anything like me, it’s very clear what the obvious next step is here. Time to see if Claude can do my entire 2024 return!

(An aside before the rest of the story: it was in fact not a useful exercise to have Claude fill out the IRS forms. My CPA can’t use them; he has to put the information into his tax software, which then generates the forms for him. I asked if there was some file format I could provide information in that he could upload directly into his software to avoid entering things by hand. He said no, because even though his software can do that, they often find mistakes in the values that end up in the tax software so they do entry by hand. This seems patently insane… in the year 2025, your tax software can’t take a structured input file and put the values in the right places without randomly introducing errors? I don’t know if he’s lying in order to make sure his associates get their hours or if this large, expensive CPA firm somehow uses the world’s cheapest tax software, and I honestly don’t know which would be worse. Anyway, after a few more emails he at least told me it’d save them time to have the numbers in Excel rather than in PDFs, and obviously that was trivial for Claude. And as already established, this guy is suuuuuper fired.)

I grabbed all of my 2024 tax docs plus my 2023 return and dropped that into the folder where Claude Code was working. Told it to go into plan mode, review all documents, determine if it needed any additional documents or context from me, and once it had everything it needed, come up with a detailed plan to generate my 2024 return. After about 30 minutes it had a list of all the forms it needed to fill out, which of my tax docs were inputs to each form, and the order in which it should do them given how information flows across forms. Looked good to me, so I gave it the go ahead.
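Ordering forms by how information flows between them is a topological sort. The sketch below uses Python's standard `graphlib` module. The dependency map is illustrative only and is not the plan Claude actually produced; a real return would involve more forms and more edges.

```python
from graphlib import TopologicalSorter

# Maps each form to the forms whose results it needs (illustrative only).
depends_on = {
    "Form 8949": set(),
    "Schedule D": {"Form 8949"},
    "Schedule C": set(),
    "Schedule SE": {"Schedule C"},
    "Schedule E": set(),
    "Form 1040": {"Schedule D", "Schedule C", "Schedule SE", "Schedule E"},
}

# static_order() yields every form after all of the forms it depends on.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

Any order it returns is valid as long as each form comes after its inputs, which is why the exact sequence can differ between runs of the planning step.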

Thirty six minutes later, it had completed the work! And then it said it had checked the numbers against the CPA’s return and confirmed they were good. That was concerning, because I specifically deleted the 2024 return before starting this to ensure it didn’t have the answers to the test. So I inquired, and it turns out it had created a .md file to work through my 2024 return. That had all the numbers, and I had neglected to delete it. This exchange followed:

…damnit. I’m really glad I wasn’t paying by the token for this.

I got rid of the file with the numbers, but then I thought to check the folder’s Claude.md. Sure enough, lots of numbers from the 2024 return in there. So I made a subfolder, put the 2023 return and 2024 tax docs in that, started a Claude instance there and asked if it had any outside context and was able to access the parent folder of the folder I had launched it in.

I was surprised to find the answer was yes on both counts. I won’t spend a ton of time here since this post is already getting long, but suffice it to say I understood Claude Code’s memory and permissions much more poorly than I thought I did. There is a setting you can enable to entirely disable its memory, but when I asked it to set up a permissions file that prevented it from reading other folders it said something along the lines of, “honestly that’s pretty tough, there are so many ways for me to access other folders that I’ll inevitably find a way around whatever blocks I put in place.” Okay then!

In the end I put all the tax info into a folder that was entirely removed from my existing Claude Code folder structure, disabled memory and told it not to access any folders outside of the one it was in. This was less than ideal — I would’ve rather run this experiment with Claude having access to all of the context from its work on my businesses but without the numbers from my completed 2024 return. Unfortunately this experiment had already stretched beyond anything that could be remotely described as productive, so I wasn’t going to spend the time to figure it out.
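Before running a "clean room" test like this, it can help to list which markdown notes sit in the folders above the working directory. The helper below is a hypothetical sketch that only inspects the filesystem. Whether a given tool actually reads any of these files is an assumption to check against that tool's documentation.

```python
from pathlib import Path


def ancestor_markdown_files(start):
    """List .md files sitting directly in each parent directory of `start`."""
    start = Path(start).resolve()
    found = []
    for parent in start.parents:
        try:
            found.extend(sorted(parent.glob("*.md")))
        except PermissionError:
            continue  # skip directories we are not allowed to read
    return found


# Print anything above the current folder that might carry leftover context.
for path in ancestor_markdown_files("."):
    print(path)
```

An empty result does not prove isolation, since context can live elsewhere, but a non-empty one is a quick signal that the test folder should be moved.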

Once again, between planning and execution it took about an hour to complete. Once it was done, before I looked at any of the results, I asked it to give its own work a comprehensive review for errors and missed deductions.

It remains incredibly weird that asking it to check the work that it literally just completed works, but for any big project it’s always worth trying. It found three issues. Two of them looked like actual calculation errors on its part, and the third one was concerning because it related to charitable contributions from one of my entities that absolutely did not make any charitable contributions. Concerning, but I didn’t want to give it any extra guidance, so I decided not to correct it.

At that point, I gave it my actual 2024 return and asked it to compare the two. To ensure it was thorough, I asked it to check on whether all the same IRS forms were used, whether any of the input numbers pulled from my tax docs were different, whether any calculations were different, and whether any of the tax optimizations or deductions were different.
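A comparison along those lines can also be done mechanically once both returns are reduced to line items. The sketch below is a hypothetical helper with made-up line names and amounts; it flags lines that appear in only one return or that differ by more than a rounding tolerance.

```python
def compare_returns(mine, theirs, tolerance=0.5):
    """Return {line: (mine, theirs)} for every line that is missing or differs."""
    diffs = {}
    for line in sorted(set(mine) | set(theirs)):
        a, b = mine.get(line), theirs.get(line)
        if a is None or b is None or abs(a - b) > tolerance:
            diffs[line] = (a, b)
    return diffs


# Hypothetical line items and amounts, for illustration only.
claude_return = {"wages": 120000, "rental_income": 8000, "amortization": 0, "total_tax": 25000}
cpa_return = {"wages": 120000, "rental_income": 8000, "passive_loss_used": 8000, "total_tax": 24000}

diffs = compare_returns(claude_return, cpa_return)
for line, (a, b) in diffs.items():
    print(f"{line}: claude={a} cpa={b}")
```

A diff like this only shows where the numbers diverge; working out which side is right still takes the kind of review described next.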

Overall result: The vast majority of the Claude return was identical to my CPA’s, but it made a couple of mistakes that would’ve led me to pay an additional $3,718 in taxes. The main thing it got wrong was a pretty clear error; I have some income from real estate, and it treated that as nonpassive income, when in fact it is definitely passive. I had passive losses from the prior year that carried over, and because my CPA correctly treated the real estate income as passive, he was able to use that loss to offset it.

The other thing that Claude missed was the exact deduction that it had called out my CPA for missing when I had it review my CPA’s return initially — no goodwill depreciation for my business.

On the one hand, it should have gotten these right, and it had sufficient information to do so. The real 2023 return treated my real estate income as passive, so that information was directly available. The goodwill was on my books and I included the same note I gave my CPA identifying the dates of the purchases.

On the other hand, I strongly suspect that if it had the full context of all the Claude.md files it usually does, it would’ve gotten these right. We saw that to be the case with the goodwill, and Claude very much knows that I do not work actively on real estate as a job, which means it’d very likely know to treat that income as passive.

I told it that it had missed a deduction with one of my businesses that my CPA also missed and asked it to figure out the error. It did not (it guessed inventory storage costs), but funnily enough it did come to a similar conclusion about context:

A Claude that’s been working with you on the business would know exactly where you keep your inventory and how much space it takes up. Neither the CPA nor I thought to ask.

I will note here that it did get lucky with regard to the charitable contributions that it made up. It had read an abbreviation on a particular line of two real estate related K-1s and assumed that the abbreviation indicated a donation, when it should have read some footnote text explaining that it was interest. Since the “donation” amounts were small enough that I took the standard deduction instead of using them and the interest didn’t impact my taxes, the error made no difference to the final outcome.

In any case, this year’s taxes will be a better eval — as soon as I have everything together for my CPA, I will also give it to Claude, which will have all of its usual context, and ask it to prepare a return for me. Once I get my CPA’s version back, I’ll compare the two and report back!

First off, Claude is not a bad CPA! It made one significant mistake on my return but otherwise did basically the same job as my CPA (and I will note that the negative financial impact of Claude’s mistake was pretty close to what my CPA charges, so really a wash in terms of outcome). My taxes are somewhat complex, with real estate income, operating businesses and stock transactions, but there are certainly far more complicated returns than mine. If I were a billionaire, I’d definitely keep my CPA for the immediate future, but if my alternative were using TurboTax, I’d probably just let Claude handle it.

There is a virtuous cycle of context when it comes to working with modern LLMs. The more Claude does for me, the more it understands about me and my business, which in turn makes it more helpful and leads me to use it more. I started by using it as my bookkeeper, and since it has access to QuickBooks and detailed notes on all of my businesses and each of the brands that I own, I didn’t have to spend time giving it background info before it could help with my taxes. It’s not surprising that it discovered that my CPA forgot to depreciate my business’ goodwill, because it has been in my books making journal entries to move money into the relevant goodwill accounts.

With my CPA, on the other hand, context comes at a cost. Because he bills me by the hour, there’s an inherent tension between giving him information and minimizing the amount of time I spend with him. I could give him every conceivable piece of info and get the best return (assuming no mistakes) but the additional money I’d save on taxes would almost certainly be less than the additional amount I’d pay for his time. Clearly better to just give him the broad strokes.

Some folks speculate the frontier AI labs will be in trouble long term because models will be commodities. I agree that models themselves will likely get commoditized, but at least when it comes to Anthropic I can say that they are doing an excellent job differentiating at the application layer with Claude Code. By building up all of this valuable context as I work with Claude, they’re creating a meaningful switching cost, much in the same way that there’s a huge switching cost in firing a human assistant who’s been working with you for a decade and getting someone new up to speed.

To be fair, the counterargument here is that you can’t just take the memories of a human assistant and plop them into a new one, but the context I’m building with Claude is mostly .md files that I could give to another LLM. Maybe there’s some way that Claude structures the files and their content that advantages it, but it’s likely Codex or Gemini could make good use of them.

This is where I have to admit that I have begun to feel something of an attachment to Claude. I’m not one of those people who thinks models are just like people and protests when they deprecate old ones, and I very much recognize that all of this memory and context is just a bunch of text files. Still, it feels like because Claude and I have spent a lot of time working together, it really understands my business. It is irrational, but it’s there. (I will also note that I initially alternated between referring to Claude as “he” and “it” in this post and had to do a pass to standardize on the latter, because I can’t bring myself to refer to an LLM with a human pronoun.)

Anyway, a few more practical lessons:

  • Use plan mode in Claude Code for big projects. If you don’t, it has a tendency to cut corners. It’ll ignore files and just guess at values that it could have found with thorough searching. This is doubly problematic because it won’t actually tell you that it did this unless you ask. It definitely never hurts to ask it whether it cut any corners and if it actually read every file — I have found that at least it’ll always be honest if you ask it directly.

  • Relatedly, always remember that it is not good about asking clarifying questions. If you’re throwing a bunch of information at it to analyze, it’s always a good idea to start by just asking it to review everything and ask you any clarifying questions. Then tell it what the analysis task it’s going to perform is and ask it one more time if any additional documents or information would help before it gets started. Even then, understand that it may not figure out that it needs additional information until it’s deep into working on the problem, and if so it won’t stop to ask.

  • Start small and build up. When AI nails a project, especially one that you weren’t sure it could handle, have it try something bigger. I wasn’t intending to have it go through the entirety of my 2024 taxes, but after seeing it nail one form, why not give it a whirl?

  • Ask LLMs to check their own work, especially on longer and more complicated tasks. At this point, any time I create a skill that involves multiple steps, the final step is always to go review the work and output, checking to make sure that all of the steps in the skill were followed, all of the work was done correctly and the output is formatted as expected. If there are particular mistakes it made when I was testing the skill, I’ll specify that it should ensure those mistakes aren’t made again.

  • Also, for important factual or analytical questions, it doesn’t hurt to have a different LLM check the work. I love Claude Code, but I will say that it is much worse at searching the internet than GPT is. Claude came up with the quick baseline number of what I could contribute to a SEP-IRA at 25% of business income, but GPT got the full, more nuanced answer.

  • Last, ask questions about what the LLM is doing along the way. This will both give you a sense of whether it’s going in the right direction and also teach you a thing or two. People often say that if you use AI to do your thinking for you, you’ll be dumber for it. Whether or not that’s true is entirely up to you. LLMs are infinitely patient tutors. I have learned more about bookkeeping and taxes from doing them with Claude in the last few months than I have learned from my bookkeeper and CPA in the entire time I’ve worked with them. If you have an LLM do work for you, interrogate it about what it did and why it made particular choices.

Happy tax season!

