Re: String Manipulation



On 6/27/07, Tom Phoenix <tom@xxxxxxxxxxxxxx> wrote:
snip
> Does the data have some defined grammar, or a definable one at least?
> If you are up to using Parse::RecDescent, it will probably do the job.
snip

Many people are afraid to use Parse::RecDescent because of the
learning curve involved. I find that odd, given that these people
already use regexes, but perhaps an example will spur people to use
it. This is a simple parser for the strings provided. Given the
structure of the strings, I have no doubt that the grammar is
incomplete (for instance, I only allow one-dimensional arrays), but it
can probably be extended from here as new examples present themselves.

#!/usr/bin/perl

use strict;
use warnings;
use Parse::RecDescent;

my @string = (
"{
STACK_CC_SS_COMMON_TYPE_REFERENCE_ID_T pp_reference_id;
STACK_CC_SS_COMMON_TYPE_CM_LOCAL_CAUSE_T generic_cause;
STACK_CC_SS_COMMON_TYPE_CM_LOCAL_CAUSE_T specific_cause;
STACK_CC_SS_COMMON_TYPE_CHANNEL_INFO_T channel_info;
STACK_REG_COMMON_TYPE_RAB_RB_INFO_T rab_info;
STACK_CC_SS_COMMON_TYPE_L3_MSG_UNIT_T pp_l3_msg;
} STACK_PRIMITIVE_MNCC_MESSAGE_T;
};",
"{
UINT8 mms; /* More messages to send */
UINT8 transport_method;
UINT8 mo_rpdu[STACK_MSG_COMMON_TYPE_TF_MAX_VAR_MSG_LEN];
} STACK_PRIMITIVE_MNSMS_EST_REQ_T;
};",
"{
STACK_REG_COMMON_TYPE_REG_CAUSE_T pp_reg_cause; /* Reason
the primitive was sent */
STACK_REG_COMMON_TYPE_PLMN_T pp_plmn; /* PLMN MS
should move to */
STACK_REG_COMMON_TYPE_SIM_T pp_sim_type; /* Valid
only on BUTE */
STACK_REG_COMMON_TYPE_NW_MENU_PARAMS_T pp_nw_menu_params; /*
Valid only when pp_reg_cause is
*
_CAUSE_NW_MENU_CHANGE, _CAUSE_POWER_ON */
BOOL cingular_ens_sim_phone; /* Valid when
pp_reg_cause is SIM_INSERT */
BOOL tty_enabled; /* Valid only on BUTE. This
is valid when the reg_cause
* is SIM_INSERT,
POWERON and BANDSWITCH.
* TRUE : restrict RAT to GSM
* FALSE: Don't
restrict RAT to GSM
*/
} STACK_PRIMITIVE_MNMM_REG_REQ;
};"
);

my $p = Parse::RecDescent->new(join '', <DATA>) or die "parser error";

for my $s (@string) {
    warn "could not parse [$s]" unless $p->text($s);
}

__DATA__
text: <skip: qr{\s* (/[*] .*? [*]/ \s*)*}sx> '{' statement(s) '}'
      identifier ';' '};' {
        our @vars;
        # item 5 is the structure name, since the skip directive is item 1
        print "$item[5]\n", @vars;
        @vars = ();
        1; #make sure the rule returns true
    }
statement: identifier identifier array(?) ';' {
        our @vars;
        # the optional array item is an array ref holding zero or one match
        my ($type, $var, $elems) = (@item[1,2], $item[3][0]);
        if ($elems) {
            $elems =~ s/\[(.*)\]/$1/;
            $type = "array of $type with $elems elements";
        }
        push @vars, "\tdata type is $type and variable name is $var\n";
        1; #make sure the rule returns true
    }
array: /\[.*?\]/
identifier: /[A-Za-z_][A-Za-z0-9_]*/
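
The <skip: ...> directive in the text rule replaces the parser's default
whitespace-skipping pattern with one that also swallows C comments, which
is why the comments in the strings never reach the statement rule. A
minimal sketch of that pattern on its own, in plain Perl with no parser
involved; strip_leading_skip is a made-up helper for illustration, not
part of Parse::RecDescent:

```perl
use strict;
use warnings;

# Same pattern as in the <skip: ...> directive: optional whitespace
# followed by any number of /* ... */ comments, each with optional
# trailing whitespace. /s lets . cross newlines, /x ignores the spaces.
my $skip = qr{\s* (/[*] .*? [*]/ \s*)*}sx;

# Hypothetical helper: remove whatever the skip pattern matches at the
# front of the text, roughly what the parser does before each token.
sub strip_leading_skip {
    my ($text) = @_;
    $text =~ s/\A$skip//;
    return $text;
}

print strip_leading_skip(
    "  /* More messages\n to send */\n UINT8 transport_method;"
), "\n";
```

This prints "UINT8 transport_method;". Only leading whitespace and
comments are removed; a comment later in the text is left alone until
the parser gets that far.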

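As a sketch of how the grammar might be extended past one-dimensional
arrays: changing array(?) to array(s?) accepts any number of [..]
suffixes. This is an assumption about what such data would look like,
since none of the strings provided has a multi-dimensional member; the
grid[ROWS][COLS] input below is invented for the example. The rule sets
$return (RecDescent's variable for a rule's value) instead of printing,
so the result can be inspected directly.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use Parse::RecDescent;

# Only the statement rule is shown; it is used as the start rule here.
my $grammar = <<'END_GRAMMAR';
statement: <skip: qr{\s* (/[*] .*? [*]/ \s*)*}sx>
           identifier identifier array(s?) ';' {
        # the skip directive is item 1, so the items of interest are 2 to 4
        my ($type, $var, $dims) = @item[2, 3, 4];
        my @sizes;
        for my $dim (@$dims) {
            # strip the surrounding brackets from each dimension
            (my $size = $dim) =~ s/\[(.*)\]/$1/;
            push @sizes, $size;
        }
        if (@sizes) {
            $type = "array of $type with dimensions " . join(' x ', @sizes);
        }
        $return = "data type is $type and variable name is $var";
    }
array: /\[.*?\]/
identifier: /[A-Za-z_][A-Za-z0-9_]*/
END_GRAMMAR

my $parser = Parse::RecDescent->new($grammar) or die "parser error";

# hypothetical two-dimensional member, not taken from the original data
my $result = $parser->statement("UINT8 grid[ROWS][COLS];");
print defined $result ? "$result\n" : "could not parse\n";
```

For the input above this should report an array of UINT8 with
dimensions ROWS x COLS. A member with no brackets at all still parses,
because array(s?) also matches zero occurrences.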