Freshmen and seniors at about 200 colleges across the U.S. take a little-known test every year to measure how much their thinking skills improve during college. The results are discouraging.

At more than half of schools, at least a third of seniors were unable to make a cohesive argument, assess the quality of evidence in a document or interpret data in a table, The Wall Street Journal found after reviewing the latest results from dozens of public colleges and universities that gave the exam between 2013 and 2016.

At some of the most prestigious flagship universities, test results indicate the average graduate shows little or no improvement in critical thinking over four years.

Some of the biggest gains occur at smaller colleges where students are less accomplished at arrival but soak up a rigorous, interdisciplinary curriculum.

For prospective students and their parents looking to pick a college, it is almost impossible to figure out which schools help students learn critical thinking, because full results of the standardized test, called the College Learning Assessment Plus, or CLA+, are seldom disclosed to the public. This is true, too, of similar tests.

Some academic experts, education researchers and employers say the Journal’s findings are a sign of the failure of America’s higher-education system to arm graduates with analytical reasoning and problem-solving skills needed to thrive in a fast-changing, increasingly global job market. In addition, rising tuition, student debt and loan defaults are putting colleges and universities under pressure to prove their value.

A survey by PayScale Inc., an online pay and benefits researcher, showed 50% of employers complain that college graduates they hire aren’t ready for the workplace. Their No. 1 complaint? Poor critical-reasoning skills.

“At most schools in this country, students basically spend four years in college, and they don’t necessarily become better thinkers and problem solvers,” said Josipa Roksa, a University of Virginia sociology professor who co-wrote a book in 2011 about the CLA+ test. “Employers are going to hire the best they can get, and if we don’t have that, then what is at stake in the long run is our ability to compete.”

International rankings show U.S. college graduates are in the middle of the pack when it comes to numeracy and literacy and near the bottom when it comes to problem solving.

The CLA+ test raises questions about the purpose of a college degree and taps into a longstanding debate about the role of colleges: Are they designed to raise students’ intellectual abilities or to sort high-school graduates so they can find the niche for which they are best suited?

The role of a diploma as a signal of ability has been in the ascendancy recently, given how closely having a degree is tied to graduates’ lifetime earnings. The test data, by contrast, show that many students earn their degrees without improving their ability to think critically or solve problems.

Tests such as the CLA+ can be used to fulfill a mandate by accreditors for schools to show that they are trying to assess and improve the education they provide.

The CLA+ measures critical thinking, analytical reasoning, problem solving and writing by requiring students to manipulate information and data in real-world scenarios that call on different abilities. It has been lauded by a federal commission that studied higher education in the U.S.

The test has detractors. It is hard to completely untangle cause and effect in something as complicated as improving critical-reasoning skills and as broad as a college education. And students don’t always try their hardest when they take the exam, since there is little at stake for them.

Colleges where students perform poorly say it is unfair to draw sweeping conclusions from a single test. They argue that students from different colleges shouldn’t be compared because freshmen have widely varying abilities. Some prestigious schools say their students don’t show much improvement between the first and fourth years because they are so accomplished when they arrive that they have little room to improve.

Colleges where students perform well on the test say it is an accurate gauge of their academic programs.

The CLA+ requires students to use spreadsheets, newspaper articles, research papers and other documents to answer questions, make a point or critique an argument. Colleges pay about $35 a student to the test’s creator and administrator, the Council for Aid to Education, a nonprofit group in New York.

At each college, the roughly 90-minute test is given to one group of freshmen in the fall of their freshman year and to a separate group of seniors in the spring of the same academic year. A statistical analysis of the difference between the average scores of the two groups is considered a valid way to reflect the value added during four years of college.

The Journal filed public-records requests with more than 100 public institutions where students took the CLA+ between 2013 and 2016. Sixty-eight of those colleges had at least 75 freshmen and seniors take the test in the same academic year, making the results for those schools statistically valid, according to the Council for Aid to Education.

The biggest point gain came at Plymouth State University, a college in New Hampshire with about 3,600 undergraduate students. Plymouth State seniors in 2014 had an average CLA+ score of 1,185 points, which was 178 points higher than the average freshman score at Plymouth of 1,007. The school’s overall result, or “value-added score,” which includes factors such as graduation rates, put Plymouth in the 95th percentile of schools that gave the test in 2014.
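The raw point gain cited for Plymouth State is simply the gap between the two group averages. As a minimal sketch, assuming nothing beyond the two averages reported above, the arithmetic looks like this. The function name `raw_point_gain` is hypothetical, and the sketch does not attempt to reproduce the Council for Aid to Education’s value-added score, which folds in other factors.

```python
def raw_point_gain(freshman_mean: float, senior_mean: float) -> float:
    """Return the gap between the average senior and average freshman scores.

    Illustrative only: this is the simple difference in group averages,
    not the value-added score, which adjusts for additional factors.
    """
    return senior_mean - freshman_mean


# Averages reported for Plymouth State in 2014
gain = raw_point_gain(freshman_mean=1007, senior_mean=1185)
print(gain)  # 178
```

Because the freshmen and seniors tested are separate groups, the gap describes a difference between two cohorts rather than the growth of any individual student.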

Maria Sanders, an assistant philosophy professor at Plymouth State, says she isn’t surprised by its strong CLA+ results because her classes emphasize critical reasoning. In her philosophy of law class, students hold a mock trial of Lizzie Borden, the Sunday-school teacher accused in 1892 of hacking her stepmother and wealthy father to death.

“It’s not until the final exam until they realize how much they’ve learned,” said Ms. Sanders. “We don’t hand the students anything. I ask them to really think and discover what principle they will live their lives by.”

Adam Civinskas, a Plymouth State graduate heading to law school, said he and his classmates in a technical-writing class were assigned to devise a new class that would help them learn to write resumes and cover letters. He said they received almost no instructions on how to tackle this task.

His group interviewed other students and department heads, then built the class proposal from the research. “They gave us just enough information to make us ask ourselves the right questions,” said Mr. Civinskas. “That’s kind of the way everything works here.”

Overall, a majority of students at colleges that gave the CLA+ made measurable progress in critical thinking, the Journal found. Colleges that added the most value aren’t necessarily highly ranked in areas that more often build a college’s reputation, such as faculty research, graduate programs, on-campus amenities, sports programs and the selectivity of the freshman class.

“When it comes to how students select a college, we are clueless about quality,” said Tony Carnevale, director of the Georgetown University Center on Education and the Workforce. “The proxy we use is reputation.”

Flagship institutions such as the University of Kentucky and the University of Texas at Austin attract some of the brightest students in the country. Their students showed little improvement in CLA+ performance. Their value-added scores put them in the bottom third of all schools that gave the test in the same year.

Kentucky and UT Austin officials criticized the test and said they no longer use it.

At the University of Louisiana at Lafayette, three-quarters of seniors had “basic” or “below basic” levels of mastery, the two lowest ratings out of five used in the CLA+.

“I wasn’t as focused as I should have been, but in a lot of classes, we just watched videos and documentaries, and then we would talk about them. It wasn’t all that challenging,” said Jeremy Daigle, who graduated in 2011 and now works in a coffee shop in Lafayette.

The college said the test results don’t “reflect the rigor of our academic programs.” UL Lafayette said it no longer uses the CLA+.

Seniors who scored basic or below basic might not be able to “distinguish the validity of evidence and its purpose” or “determine the truth and validity of an argument,” according to the Council for Aid to Education. At least half of the seniors at a quarter of the schools reviewed by the Journal fell into those categories.

By contrast, more than 90% of seniors at California Polytechnic State University, San Luis Obispo; Miami University of Ohio; Ohio State University and the University of Georgia graduated with critical-thinking abilities rated as “proficient” or better. The two highest levels are “accomplished” and “advanced.”

Roger Benjamin, president of the Council for Aid to Education, said the test provides a sound assessment of the intellectual capital and capacity for innovation needed to succeed in the modern world. “That’s why measuring performance and working toward improvement are so critical,” he said.

At the Citadel in Charleston, S.C., 65% of seniors who took the test in 2016 were rated basic or below. The value added was low, in the 2nd percentile among all the schools where students took the exam in 2016.

The Citadel, like many schools, enshrines the importance of critical thinking in its mission statement, pledging to make sure graduates “are capable of both critical and creative thinking…and possess the methodological skills needed to gather and analyze information.”

As officials grapple with the lackluster test results, Citadel English professor Jenna Adair has begun incorporating lessons on critical reasoning into her classes.

She asks sophomores to read “Beowulf” and pretend they are journalists covering a presidential race between three characters in the millennium-old epic poem. The students must generate criteria to evaluate the candidates, which pushes them to dissect the concept of leadership. Many struggle.