How does TeXShop count words?

How does TeXShop count words?

Themis Matsoukas-6
I used Edit/Statistics… to count words in my source file. With the entire file selected (from first line to last, comments and all), I get a count of 3045 words/20021 characters. If I select from my \begin{abstract} to the end of the file, I get 3381 words/22663 characters! I didn’t expect the count to be exact, but I would have thought that a subset of the document would give a lower count than the whole. With some experimentation I discovered that when the line \begin{document} is excluded from the selection, the count increases by ~300 words. That’s with a REVTeX document; I am not sure whether the same happens with other classes.

Not a big deal, but since Dick is a mathematician I thought he’d want to know that there is a discontinuity that breaks monotonicity in the count :)
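To make the observation concrete, here is a toy Python sketch of one way a counter could behave like this. It is purely hypothetical: the rule that \begin{document} switches the counter into a stricter mode, and the names `toy_word_count`, `whole` and `subset`, are assumptions for illustration, not TeXShop's documented behaviour.

```python
import re

BEGIN_DOC = r"\begin{document}"

def toy_word_count(selection: str) -> int:
    # Hypothetical rule: the marker switches the counter into a "LaTeX mode".
    if BEGIN_DOC in selection:
        # Keep only the text after the marker and drop equation environments.
        body = selection.split(BEGIN_DOC, 1)[1]
        body = re.sub(r"\\begin\{equation\}.*?\\end\{equation\}", " ", body, flags=re.S)
        return len(body.split())
    # "Plain mode": every whitespace-separated token counts as a word.
    return len(selection.split())

whole = "\n".join([
    r"\documentclass{revtex4-1}",
    r"\begin{document}",
    "Energy is conserved.",
    r"\begin{equation}",
    "E = m c^2",
    r"\end{equation}",
    r"\end{document}",
])

# The subset starts on the line after \begin{document}.
subset = whole.split(BEGIN_DOC + "\n", 1)[1]

print(toy_word_count(whole), toy_word_count(subset))
```

With a mode switch like this, the subset can score higher than the whole, because leaving out one line changes how everything after it is treated.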

Themis




----------- Please Consult the Following Before Posting -----------
TeX FAQ: http://www.tex.ac.uk/faq
List Reminders and Etiquette: https://www.esm.psu.edu/~gray/tex/
List Archives: http://dir.gmane.org/gmane.comp.tex.macosx
                https://email.esm.psu.edu/pipermail/macosx-tex/
TeX on Mac OS X Website: http://mactex-wiki.tug.org/
List Info: https://email.esm.psu.edu/mailman/listinfo/macosx-tex
Re: How does TeXShop count words?

Herbert Schulz
> On Dec 20, 2017, at 9:29 AM, Themis Matsoukas <[hidden email]> wrote:
> [...]

Howdy,

If I had to guess I'd say it uses `detex` (to remove commands) and then `wc` to count words.
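As a rough illustration of that guess, the following Python sketch strips TeX markup with a few regular expressions and then counts whitespace-separated tokens, much as `wc -w` would. It is only an approximation written for this thread: the real `detex` handles far more cases, and the function names `strip_tex` and `count_words` are made up here.

```python
import re

def strip_tex(source: str) -> str:
    # Drop comments: an unescaped % through the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop \begin{...} and \end{...} together with the environment name.
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", " ", text)
    # Drop control words (\section, \textbf) and control symbols (\%, \\).
    text = re.sub(r"\\[A-Za-z]+\*?|\\.", " ", text)
    # Drop grouping braces and math delimiters, keeping their contents.
    text = re.sub(r"[{}$]", " ", text)
    return text

def count_words(source: str) -> int:
    # Like `wc -w`: count whitespace-separated tokens.
    return len(strip_tex(source).split())

sample = r"\section{Results} The \textbf{count} is rough. % ignore me"
print(count_words(sample))  # 5
```

Note that a stripper like this still counts the contents of math and command arguments as words, which is one reason such counts are only approximate.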

For curiosity's sake, try downloading DropTeXCount.zip and use one of the two drop-scripts enclosed in the unzipped folder to do a count on the whole document. You can get that file from <https://herbs.github.io>. Those scripts use `texcount` (part of TeX Live) to do the word count.

Good Luck,

Herb Schulz
(herbs at wideopenwest dot com)



Re: How does TeXShop count words?

Themis Matsoukas-6
Hi Herb,

Thanks for the scripts. I used DropTeXCount and got the following:

        • Encoding: utf8
        • Words in text: 2866
        • Words in headers: 24
        • Words outside text (captions, etc.): 0
        • Number of headers: 8
        • Number of floats/tables/figures: 0
        • Number of math inlines: 236
        • Number of math displayed: 41

The word count is in the ballpark of TeXShop, which gave about 3000.
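For anyone who wants to compare the two numbers programmatically, here is a small Python sketch that parses a summary like the one above into a dictionary and adds up the word categories. The labels are copied from the report in this message; the helper name `parse_summary` is hypothetical, and the parser only assumes lines of the form "label: integer".

```python
import re

summary = """\
Encoding: utf8
Words in text: 2866
Words in headers: 24
Words outside text (captions, etc.): 0
Number of headers: 8
Number of floats/tables/figures: 0
Number of math inlines: 236
Number of math displayed: 41
"""

def parse_summary(report: str) -> dict:
    counts = {}
    for line in report.splitlines():
        # Skip leading bullets or spaces, then capture "label: integer".
        match = re.match(r"^\W*(.+?):\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts

counts = parse_summary(summary)
total_words = (
    counts["Words in text"]
    + counts["Words in headers"]
    + counts["Words outside text (captions, etc.)"]
)
print(total_words)  # 2890
```

The total of 2890 words is 155 fewer than the 3045 reported by TeXShop for the whole file.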

Themis





> On Dec 20, 2017, at 10:50 AM, Herbert Schulz <[hidden email]> wrote:
> [...]

