Introduction

I often find myself having to decide between making a project in VC++ or Perl and having to make it one or the other, not both. Perl is wonderful for
string manipulation, hashes and arrays of arbitrary objects, and DWIM (Do What I Mean) behavior. VC++ is fast, has excellent type checking and debugging, and
the resulting program can be easily packaged up for other machines. Perl requires that the target machine has Perl already installed. Some operations
are one or two lines in Perl and 100 or 200 lines in VC++ (and vice versa). Perl is very fast for prototyping, etc. ad nauseam.

I have seen the manual pages for Perl (perlguts, perlembed, perlapi, ...) showing how easy (ha!) it is to embed Perl into C/C++, but they are almost
incomprehensible to somebody who doesn't get into the guts of Perl. Almost as bad as OLE!

Further, even with the code to have embedded Perl, there is still the issue of getting C++ variables into and out of that instance of Perl. Even more
arcane magic is required. This led me to spend some time reading and testing Perl embedding capabilities. Virtually everything I have here comes from
the Perl manual pages, particularly perlguts, perlembed, and perlapi. These are not for the faint of heart. They certainly aren't for casual use.

This effort, plus a little experience in real-world applications using embedded Perl, yields the following:

Class CPerlWrap

Update (21-Feb-2012): See also CPerlWrapSTL in the source archive for a non-MFC version, courtesy of CodeProject member SLJW (a.k.a., jwilde).

This class allows you to create an instance of Perl, pass variables into and out of that instance, and run arbitrary scripts. The instance "stays
alive" until explicitly destroyed, so you can run many different scripts without re-instantiating.

The three major variable types in Perl are the scalar ($abc), the list (@def), and the hash (%ghi) which correspond
to MFC types of CString/int/double (for scalars), CStringArray (for lists), and CMapStringToString (for hashes).
For each of these, there is a get and a set function:

// These are used to create and populate arbitrary variables.
// Good for setting up data to be processed by the script.
// They all return TRUE if the 'set' was successful.
// set scalar ($varName) to integer value
BOOL setIntVal(CString varName, int value);
// set scalar ($varName) to double value
BOOL setFloatVal(CString varName, double value);
// set scalar ($varName) to string value
BOOL setStringVal(CString varName, CString value);
// set array (@varName) to CStringArray value
BOOL setArrayVal(CString varName, CStringArray &value);
// set hash (%varName) to CMapStringToString value
BOOL setHashVal(CString varName, CMapStringToString &value);
// These are used to get the values of arbitrary
// variables ($a, $abc, @xyx, %gwxy, etc.)
// They all return TRUE if the variable was defined and set
// get scalar ($varName) as an int
BOOL getIntVal(CString varName, int &val);
// get scalar ($varName) as a double
BOOL getFloatVal(CString varName, double &val);
// get scalar ($varName) as a string
BOOL getStringVal(CString varName, CString &val);
// get array (@varName) as a CStringArray
BOOL getArrayVal(CString varName, CStringArray &values);
// get hash (%varName) as a CMapStringToString
BOOL getHashVal(CString varName, CMapStringToString &value);

So if I have a CString that I want to do something Perlish on, for instance extracting all the words into an array of words, here is my VC++ code:

// perlInst is an instance of CPerlWrap
CString str("this is a verylong set of words"
            " that would be a pain to deal with in C++");
perlInst.setStringVal("string",str);
// backslashes meant for Perl are doubled inside a C++ string literal
perlInst.doScript("@b = split(/\\s+/, $string);");
CStringArray words;
perlInst.getArrayVal("b", words);

(Yes, this could be done in C++, but it's an easy example!)
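For comparison, here is roughly what the same split looks like in plain C++. This is only a sketch using the standard library (std::string and std::vector instead of CString and CStringArray), and splitWords is a name made up for this illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits text on runs of whitespace. For input without leading whitespace
// this gives the same words as Perl's split(/\s+/, $string).
std::vector<std::string> splitWords(const std::string& text)
{
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word)          // operator>> skips whitespace between words
        words.push_back(word);
    return words;
}
```

It is short for this case, but anything beyond whitespace splitting (captures, substitutions, case changes) grows quickly in C++, which is where the one-line Perl script pays off.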

Or perhaps I want to capitalize each word in that string, using the following VC++ code:

// perlInst is an instance of CPerlWrap
CString str("this is a verylong set of "
            "words that would be a pain to deal with in C++");
perlInst.setStringVal("string",str);
perlInst.doScript("$string =~ s/(\\w+)/\\u\\L$1/g;");
perlInst.getStringVal("string", str);

The results:

This Is A Verylong Set Of Words That Would Be A Pain To Deal With In C++

Or how about getting the first non-trivial-sized plural word and some context?

CString script(
"$a = \"this is a verylong set of words that would be a pain to deal with in C++\";\n"
"$a =~ s/(\\w+)/\\u\\L$1/g;"
);
perlInst.doScript(script);
perlInst.getStringVal("a",str);

As it happens, this particular script doesn't really need the embedded new-line \n, but if you want
the error messages to point to something other than line 1, you'll need to add new-lines.
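One way to keep Perl's line numbers meaningful is to hold the script as separate lines and join them with new-lines just before running it. The sketch below uses std::string so that it stands alone; joinScriptLines is a hypothetical helper and is not part of CPerlWrap.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of CPerlWrap): joins script lines with '\n'
// so that element N of the vector (counting from 1) is line N of the script.
std::string joinScriptLines(const std::vector<std::string>& lines)
{
    std::string script;
    for (const std::string& line : lines) {
        script += line;
        script += '\n';
    }
    return script;
}
```

Assuming a build in which CString can be constructed from a const char*, the result could then be passed along as perlInst.doScript(joinScriptLines(lines).c_str()).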

Error detection and error messages

Error messages? Well, startling as it may seem, sometimes there are errors in the Perl script that you run. It never happens to me (#include <NoseGettingLonger>)
of course, but I've included some support for it. Here is an example showing an error and getting access to the problem report from Perl:

// this is missing the ';' at the end of the first line
CString script(
    "my $d = 'this is a verylong set of words'\n"
    "$d =~ m/(\\w+)\\s+(\\w{3,}s)\\s+(\\w+)/;"
);
if(!perlInst.doScript(script))
{
    CString errmsg = perlInst.getErrorMsg();
    // fall back to the warnings only if there is no error message
    if(errmsg.IsEmpty())
        errmsg = perlInst.getWarnings();
    MessageBox(errmsg,"Script Failure");
}

Which yields:

Scalar found where operator expected at (eval 18)
line 2, near "'this is a verylong set of words'
$d"
(Missing operator before
$d?)
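If you want to point the user at the offending script line, the line number can be pulled out of a message like the one above. This is a sketch that assumes the "at (eval N) line M" wording shown here and a C++11 compiler (for std::regex and raw string literals); perlErrorLine is a made-up name, not a CPerlWrap function.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns M from a Perl message containing
// "at (eval N) line M", or -1 if the message has no such location.
int perlErrorLine(const std::string& message)
{
    // \s+ also matches a line break between "(eval N)" and "line M".
    static const std::regex pattern(R"re(at \(eval \d+\)\s+line\s+(\d+))re");
    std::smatch match;
    if (std::regex_search(message, match, pattern))
        return std::stoi(match[1].str());
    return -1;
}
```

Other Perl versions or other kinds of errors may word the location differently, so treat a result of -1 as "unknown" rather than "no error".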

By default, warnings are not considered errors and all warnings are cleared before a script is executed. But if you want to easily detect warnings and
errors, you can use these two functions to tune CPerlWrap's behavior:

// set to TRUE if warnings cause doScript() to return FALSE
BOOL SetfailOnWarning(BOOL);
// set to TRUE if warnings are
// cleared before executing a doScript()
BOOL SetclearWarningsOnScript(BOOL);

Putting CPerlWrap into your project

First and foremost, to build a project with CPerlWrap, you need to have Perl 5.14 (or later) installed on your build machine. It is not necessary for Perl to be
installed on the target machine, but it must be on your build machine. Your target machine must have the Perl512.dll file (or Perl514.dll or
whatever you built against), so don't forget to package that up with your executable!

However, if you use a Perl package, then you may be better off with Perl installed on your target machine.

Go to http://www.activestate.com/ and download the free Windows Perl. The price is right.
Then install it. I'll wait here until that is done.

Hints and gotchas

Backslashes

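The hardest part about using CPerlWrap is the backslashes (\). If you have a string that you want evaluated (interpolated) in Perl,
such as "$var1 is xyz to $var2", then that string must be surrounded by " characters and you must escape those quotes in your VC++ code, writing \"$var1 is xyz to $var2\" inside the C++ string literal. Likewise, any backslash meant for Perl (as in \s or \w) must be doubled (\\s, \\w) so that it survives the C++ compiler.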
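Here is a minimal sketch of both kinds of escaping, with std::string standing in for CString. The last two lines assume a C++11 compiler, where raw string literals avoid the escaping altogether; the variable names are only for illustration.

```cpp
#include <string>

// What Perl should see:  $msg = "$var1 is xyz to $var2";
// In an ordinary C++ literal the inner double quotes are written as \".
const std::string quotedScript = "$msg = \"$var1 is xyz to $var2\";";

// What Perl should see:  @b = split(/\s+/, $string);
// A backslash meant for Perl is doubled in an ordinary C++ literal.
const std::string escapedRegex = "@b = split(/\\s+/, $string);";

// C++11 raw string literals pass quotes and backslashes through unchanged.
const std::string rawQuoted = R"perl($msg = "$var1 is xyz to $var2";)perl";
const std::string rawRegex  = R"perl(@b = split(/\s+/, $string);)perl";
```

Each raw literal holds exactly the same characters as its escaped counterpart, so either form can be handed to doScript().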

Processes within Perl

For reasons that I have not been able to discover, this embedded Perl doesn't allow for sub-processes (note: this statement is from 2003; the situation
may have changed by now in 2012). So Perl favorites like:

just don't work! Same thing with using the backtick “`” or the system() function.
Just don't work. If anybody has a fix for this, please let me know, as it has been a source of frustration for me.

Variable scope

In Perl, the my operator is used to declare a variable in the current scope. Scope is determined, much like in VC++, by surrounding {} pairs.
The doScript() function performs a Perl eval {script} (note the {} pair) and so any variable declared with my
will not be available with the get* and set* functions; they are local to that instance of the eval. If you like to have use strict;
in your code, then you will have to define all your "global" variables using the set* functions (which put them into the main:: package).

Using Perl modules

One of the great advantages of Perl is the long list of available modules. These are the Perl equivalent of C/C++ libraries. Modules are included using the syntax:

use CGI;
use Win32;

where CGI and Win32 are two such modules. These modules are usually included in the directory tree where Perl is installed.
Which means that using a Perl module in CPerlWrap requires that the tree be around on the target machine.

If the module in question is pure Perl (no embedded C functions), then you can copy the module (CGI.pm, Win32.pm, or whatever) to the target
machine and tell Perl where to find it with the use lib('some new directory'); pragma.
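Since the module directory is usually only known at run time, the use lib line has to be built in C++ and placed in front of the script. The sketch below shows one way to do that with std::string; makeUseLib is a hypothetical helper, and turning backslashes into forward slashes is my own choice for sidestepping the escaping problem (Perl on Windows accepts forward slashes in paths).

```cpp
#include <string>

// Hypothetical helper: builds a "use lib '...';" line for a module directory.
// Backslashes become forward slashes, so the only character that still needs
// escaping inside the single-quoted Perl string is the single quote itself.
std::string makeUseLib(const std::string& dir)
{
    std::string line = "use lib '";
    for (char c : dir) {
        if (c == '\\')
            line += '/';
        else if (c == '\'')
            line += "\\'";
        else
            line += c;
    }
    line += "';\n";
    return line;
}
```

For example, makeUseLib("C:\\MyApp\\perllib") produces the line use lib 'C:/MyApp/perllib'; followed by a new-line, ready to be prepended to the script text.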

But (there is always a but), if you want a module that has embedded C functions (such as, sadly, Win32), then you will have to diddle
the xs_init() function (found in PerlWrap.cpp) and that is way beyond what I know about. I have put some comments (gleaned
from the manual pages) to get you started, but I really know nothing about it. If you need such a module, start with perlguts, perlapi, and perlembed.

Update: Recent versions of Perl have better support for this kind of thing. In fact, these two commands are your friends:

Summary

CPerlWrap will probably always be a work in progress, so I will try and update this article when I make significant changes. I suspect that the greatest source of changes will be
fixes to bugs all of you have pointed out!

I don't pretend to be a perlguts expert -- everything is in the Perl manual pages and all I've done is to try and wrap
it up so that it is easy to use. See the disclaimers below.

Disclaimers

Your Mileage May Vary. Void where prohibited. Do not take internally. Not intended for ophthalmic use. Not intended for children under the age of 65.
Do not use while sleeping. Warning: May cause drowsiness. For indoor or outdoor use only. For off-road use only. For office use only. Do not attempt to stop
chain with your hands or genitals. Remember, objects in the mirror are actually behind you. This product not tested on animals. No humans were harmed or even
used in the creation of this page. Not to be taken internally, literally, or seriously.

Some assimilation required. Resistance is futile.

This product is meant for educational purposes only. The manufacturer will not be responsible for any damages or inconvenience that may result and no claim
to the contrary may legitimately be expressed or implied. Some assembly required. Use only as directed. No other warranty expressed or implied. Do not
use while operating a motor vehicle or heavy equipment. May be too intense for some viewers. No user-serviceable parts inside. Subject to change without
notice. Breaking seal constitutes acceptance of agreement. Contains a substantial amount of non-tobacco ingredients. Use of this product may cause a
temporary discoloring of your teeth. Not responsible for direct, indirect, incidental, or consequential damages resulting from any defect, error, or failure
to perform. Don't try this in your living room; these are trained professionals. Sign here without admitting guilt. Out to lunch. The author is
not responsible for any mental distress caused. Use under adult supervision. Not responsible for typographical errors. Do not put the base of this ladder on
frozen manure. Some of the trademarks mentioned in this product appear for identification purposes only. Objects in mirror may be closer than they appear.
These statements have not been evaluated by the Food and Drug Administration. This product is not intended to diagnose, treat, cure, or prevent any disease.
Not authorized for use as critical components in life support devices or systems. In the unlikely event of an emergency, participants may be liable for
any rescue or evacuation costs incurred either on their behalf or as a result of their actions. In certain states, some of the above limitations may not apply
to you. This supersedes all previous notices unless indicated otherwise.

Comments and Discussions

The code is kinda very bad to use. Not trying to be an arse about but just telling you the truth.

1. No const correctness.
2. No usage of templates.
3. Should return a value and use a parameter for checking if it is successful.
4. Using hard coded linking to a library. Which should be in the actual project settings and not in source files. This makes it very difficult to port and it is not portable. If someone wanted to port your bad code.
5. You don't even bother with using STL at all period. Using char everywhere rather than using an actual string class to handle it properly.
6. A lot of your function names are improperly cased.
7. Function parameters in C++ do not need void if there aren't any parameters. This isn't C. Quit bringing your bad habits to C++. It just makes it worse.
8. Using Hungarian notation is not recommended anymore. We have IDEs that provide you with type information.
9. I honestly think you should rewrite this code and properly do it better this time around.

You did note that this article is from 2002, right? Standards and common practices have changed since then.

The 2/2012 minor update did add STL, as noted in the release history.

You are correct, this is not in C. It is in C++.

If I were to rewrite this now, I would definitely do this better. However, as this article was never intended to provide a finished product but rather a starting point for incorporating Perl, I don't think there is a need to rewrite. But that is just my opinion and worth every penny you paid to access this article!

CString str("this is a verylong set of words"
" that would be a pain to deal with in C++");
perlInst.setStringVal("string",str); // error: "string" is const char, not CString
perlInst.doScript("@b = split(/\s+/, $string);"); // error
CStringArray words;
perlInst.getArrayVal("b", words); // error: "b" is const char, not CString

Other CPerlWrap member functions produce the same error, e.g. setArrayVal.

I call doScript() many times.
As I try to get some good performance, I first run a script to make numerous initializations and then I run my script when needed.
Unfortunately, it seems that this consumes a lot of memory.

I tried the following:
for (UINT i=0; i<100000; i++) Perl->doScript("1;");
And in fact this will consume about 40 MB.

There is no memory leak as this memory will be released when I delete the object.

However, this is an issue because, in my project, I call doScript so many times that in the end I have 1 GB of memory allocated.

I have no idea why your program's space is increasing so much. It may be that a better approach is to write the program in Perl and add extensions in C/C++ for those things best done outside of Perl instead of the other way around.

Or, if all you are doing is using the power of regular expressions, consider using one of the many regexp libraries, such as PCRE.