Upload/Processing Error

apcraig · May 28, 2020, 4:59pm

Description

Upload files are not showing up at Codecov.io

Repository

CI/CD

github, our own hardware, bash uploader

Uploader

bash uploader

Commit SHAs

Codecov YAML

coverage:
  range: "20...100"
  round: down
  precision: 2

comment: false

Codecov Output

…
+ /glade/work/tcraig/cice-consortium/test_cice_icepack.200527cc/testsuite.200527cc/cheyenne_gnu_smoke_gx3_4x4_alt05_debug_short.200527cc/codecov_output/#glade#work#tcraig#cice-consortium#test_cice_icepack.200527cc#cicecore#shared#ice_restart_shared.F90.gcov bytes=3799
==> Appending adjustments
Fixing Reports
+ Found adjustments
==> Gzipping contents
==> Uploading reports
url: https://codecov.io
query: branch=master&commit=25dd6d76f4a9f40c4dc2e1eddf737b436721e15d&build=&build_url=&name=25dd6d76f4%3Amaster%3Acheyenne%20first_suite%2Cbase_suite%2Ctravis_suite%2Cdecomp_suite%2Creprosum_suite%2Cio_suite%2Cquick_suite&tag=&slug=apcraig%2Ftest_cice_icepack&service=&flags=&pr=&job=
→ Pinging Codecov
https://codecov.io/upload/v4?package=bash-20200430-d757c17&token=secret&branch=master&commit=25dd6d76f4a9f40c4dc2e1eddf737b436721e15d&build=&build_url=&name=25dd6d76f4%3Amaster%3Acheyenne%20first_suite%2Cbase_suite%2Ctravis_suite%2Cdecomp_suite%2Creprosum_suite%2Cio_suite%2Cquick_suite&tag=&slug=apcraig%2Ftest_cice_icepack&service=&flags=&pr=&job=
→ Uploading
→ View reports at https://codecov.io/github/apcraig/Test_CICE_Icepack/commit/25dd6d76f4a9f40c4dc2e1eddf737b436721e15d

Additional Information

When we run smaller test suites, the file upload seems to work. When we run larger test suites, the reports do not show up on Codecov.io, even though the bash uploader does not generate any errors.

apcraig · May 28, 2020, 11:33pm

More information:

The file I’m trying to upload is 31 MB gzipped, 523 MB unzipped. It contains gcov results from 137 distinct test runs and 11,548 separate gcov files. If that’s a problem, how should I proceed?

If I run the uploader script once on all files, that big file doesn’t show up (the current situation). I have also tried running the uploader script on each test separately, which should produce 137 uploads, but then most of those don’t show up at codecov either. I have confirmed that if I run a subset of the tests, say 10 (of 137), then the process works fine.
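As an illustration of the middle ground between one upload and 137, the tests could be grouped into batches of about the size that was seen to work. The following is only a sketch: the test names are hypothetical placeholders, and the batch size of 10 comes from the observation above, not from any documented Codecov limit. The upload step itself is left out.

```python
def make_batches(test_dirs, batch_size=10):
    """Split a list of test directories into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        test_dirs[start:start + batch_size]
        for start in range(0, len(test_dirs), batch_size)
    ]


# Hypothetical test names standing in for the 137 real test runs.
tests = [f"test_{n:03d}" for n in range(1, 138)]
batches = make_batches(tests)

# 137 tests in batches of 10 give 14 batches, the last one holding 7 tests.
print(len(batches), len(batches[-1]))
```

Each batch would then be passed to the uploader as its own report, so no single upload comes close to the full 523 MB.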

If I understand the current implementation, it just concatenates all the gcov files into one file (523 MB), then gzips it and sends it to codecov.io. I guess Codecov then parses the file and aggregates the results. Is there a way to process each test sequentially and aggregate the results on the client side, so that only the aggregated result is sent to codecov.io? That would mean one file sent to codecov that looks to codecov as if it came from a single test run.
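To make the client-side aggregation idea concrete, here is a minimal sketch that sums per-line hit counts across several gcov text reports for the same source file. It is not a feature of the bash uploader, and the function names (`merge_gcov`, `render_gcov`) are made up for this example. It assumes the plain `count:lineno:source` gcov text layout, ignores branch and function summary lines, and has not been checked against Codecov's parser.

```python
import re
from collections import defaultdict

# A gcov text line looks like "<count>:<lineno>:<source text>".
LINE_RE = re.compile(r"^\s*([^:]+):\s*(\d+):(.*)$")


def parse_count(field):
    """Return None for non-executable lines ('-'), else an int hit count."""
    field = field.strip().rstrip("*")
    if field == "-":
        return None
    # Markers such as '#####' or '=====' mean the line was never executed.
    return int(field) if field.isdigit() else 0


def merge_gcov(reports):
    """Sum per-line hit counts across gcov reports, keyed by source file."""
    merged = defaultdict(dict)  # source -> {lineno: [count, text]}
    for report in reports:
        source = None
        for raw in report.splitlines():
            match = LINE_RE.match(raw)
            if not match:
                continue  # branch/function summary lines carry no line count
            field, lineno, text = match.group(1), int(match.group(2)), match.group(3)
            if lineno == 0:
                # Line 0 holds metadata such as "Source:<path>".
                if text.startswith("Source:"):
                    source = text[len("Source:"):]
                continue
            if source is None:
                continue
            count = parse_count(field)
            entry = merged[source].setdefault(lineno, [None, text])
            if count is not None:
                entry[0] = (entry[0] or 0) + count
    return merged


def render_gcov(source, lines):
    """Write merged counts back out in the same gcov-style text layout."""
    out = ["        -:    0:Source:" + source]
    for lineno in sorted(lines):
        count, text = lines[lineno]
        if count is None:
            field = "-"
        elif count == 0:
            field = "#####"
        else:
            field = str(count)
        out.append(f"{field:>9}:{lineno:>5}:{text}")
    return "\n".join(out) + "\n"
```

With this approach, 137 runs over the same source tree would collapse to one gcov file per source file rather than 11,548 files, at the cost of losing the per-test breakdown.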

tom · June 3, 2020, 4:24pm

Hi @apcraig, it looks like the large file upload is indeed timing out. You mentioned trying to run the uploader on each test separately. Would you be able to try that, and we can debug that situation? Unfortunately, I don’t have a good answer to uploading very large files.

apcraig · June 8, 2020, 7:14pm

We could break the testing down into smaller suites, although it would slow down our workflow. The other issue, as I alluded to, is that when I’ve tried to upload multiple reports, I’ve had mixed results. Often those reports don’t show up. Large reports time out, as noted here. If we send hundreds of small reports, most don’t show up either. In the end, we haven’t had a lot of success getting codecov to handle our upload reports. It seems there is a very specific sweet spot with regard to size and number.

tom · June 8, 2020, 7:57pm

Hi @apcraig, I understand that’s probably a pretty unsatisfying answer. Right now, half a gig is pretty large as far as uploads go. If you were to switch to multiple reports, what can we do on our end to help support that? How long ago did you last try uploading multiple reports?

apcraig · June 8, 2020, 9:07pm

Thanks for the help @tom. I appreciate it. I don’t know how far we’re willing to go to fit into the sweet spot of successful codecov upload reports.

I’ve been trying to use codecov off and on with limited success for over a year. It works fine just enough to give me some encouragement, but has failed too often when I run a full test suite. We actually love how it works and what is produced. But as you can understand, it’s not that helpful to just get 10%, 50%, or even 90% of the results. We need 100% of the data to be visualized for the tool to be useful, and that has been elusive. I also suspect the way we operate is somewhat different from others. We have a large science code that is written in Fortran with many different features that have to be tested separately.

I have raised several issues in the past and haven’t seen a lot of improvements. To be honest, we are likely to move to lcov at this point. I was prototyping it a couple weeks ago for the first time, and it seems to be much more reliable and handles our full test suite without a problem. lcov is old, requires some installation on platforms where we want to test, does not aggregate multiple test suites as easily, requires that we manage the web posting, and the web reporting is not as nice. But lcov is faster, the cost to manage support scripts seems likely to be much less than the cost of working with codecov, and most important, lcov seems to be robust. We will keep the codecov scripts available in our workflow, and may test occasionally with the hope that we can move back to it. But until codecov is working more reliably, we’ll probably use lcov for our regular code coverage analysis.

Again, we really do love what codecov does when it works. But we’ve spent way more time than we wanted trying to understand and debug problems, and we’ve run into many more basic issues than we expected. A lot of the problems have been silent and happening on the codecov side which makes them nearly impossible for us to understand. We’ve spent a bunch of time trying to figure out whether we’re doing something wrong when we weren’t.

My recommendation for codecov would be to spend more time making the upload and reporting more reliable, even if that means degrading some features or reducing the performance a bit. The number one requirement for codecov has to be that all data sent to it is received, checked, processed, and posted correctly. If you can’t do that, features and performance don’t seem to matter. I believe others are having similar problems, given the issues regularly created about report merges failing, processing hanging, reports being inaccurate, and other things.

Again, thanks for the help from you @tom and @drazisil. I do hope the tool continues to improve. As I said, we’ll try to find time to keep our toe in. If we can help test upgrades in the future, please feel free to reach out to us.

tom · June 9, 2020, 2:42pm

Hi @apcraig, I really appreciate the well-written response here, and I appreciate your honesty. Although we are sad to see you leave, at the end of the day, what’s most important is that you are able to use a solution that works. We will work diligently to shore up and improve our product so that the next time around, you find high-quality features and performance as well as reliability.

We’d love to stay in touch to help test upgrades. I hope we will change your mind about using Codecov in the future.
