2021 fall cs61a hw09

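Link: https://inst.eecs.berkeley.edu/~cs61a/fa21/hw/hw09/#q3-cs-classes
problem1: Straightforward once you know the basic regular expression rules.
problem2:
Step 1: match [IVXLCDM]. Step 2: use \b so that letters sitting inside other words are not matched. Step 3: use + so the letters appear at least once. The + belongs on the character class, with a \b on each side of it.

        return re.findall(r"\b[IVXLCDM]+\b", text)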
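
A quick way to check the three steps is to run the pattern on a made-up sentence. This is only a sketch: the wrapper name roman_numerals and the sample strings are assumptions for illustration.

```python
import re

def roman_numerals(text):
    # \b on both sides keeps a run of numeral letters from matching inside a longer word
    return re.findall(r"\b[IVXLCDM]+\b", text)

print(roman_numerals("Richard VI met Richard IV in MMXXI"))  # ['VI', 'IV', 'MMXXI']
print(roman_numerals("DIVIDE the Viking"))                   # []
```

The second call returns nothing because DIVID is followed by E and V is followed by i, so no word boundary closes the run.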

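problem3: Not much difficulty here. Two details matter: the cs/CS alternation needs its own group, otherwise | splits the whole pattern and a bare "cs" anywhere in the post counts as a match; and inside [...] the characters (, | and ) are literals, so the letter suffix is written as a plain character class. The suffix is made optional here, on the assumption that a class like CS70 should also match.

        return bool(re.search(r'(?:cs|CS) ?\d+[a-cA-C]?', post))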
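
Because | has the lowest precedence in a regular expression, it is worth checking how an ungrouped alternation behaves. The sketch below compares an ungrouped and a grouped prefix; the variable names and sample posts are assumptions for illustration.

```python
import re

ungrouped = r'cs|CS ?\d+'    # read as: 'cs'  OR  'CS ?\d+'
grouped = r'(?:cs|CS) ?\d+'  # read as: ('cs' OR 'CS'), optional space, digits

post = "I like physics"      # contains 'cs' but no class name

print(bool(re.search(ungrouped, post)))                # True: the bare 'cs' in 'physics' matches
print(bool(re.search(grouped, post)))                  # False: no digits follow 'cs'
print(bool(re.search(grouped, "how hard is cs61a?")))  # True
```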

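problem4:
One point here concerns capturing and non-capturing groups. When a pattern contains capturing groups, findall returns the captured groups instead of the whole match. In this problem, without the two ?: markers it would return only the captured pieces (something like "5" "AM" "7" ""); with them it returns the full times correctly.
There are three common non-capturing constructs: (?:...), (?=...) and (?!...).
The first, (?:...), matches text but does not capture (save) it. For example: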

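    import re
    # named text rather than str, to avoid shadowing the built-in
    text = 'mom and dad and baby'
    a = re.findall(r'(?:mom)(?: and dad(?: and baby))', text)
    b = re.findall(r'(mom)( and dad( and baby))', text)
    print(a)
    print(b)

Result: a is ['mom and dad and baby'] (the whole match, since nothing is captured), while b is [('mom', ' and dad and baby', ' and baby')] (one tuple holding the three captured groups).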

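The second, (?=...), is similar to (?:...), with one difference: (?:...) consumes the characters it matches, while (?=...) does not.

    import re
    text = 'ababd'
    a = re.findall(r'(a(?:b)|ba)', text)
    b = re.findall(r'(a(?=b)|ba)', text)
    print(a)
    print(b)

Result: a is ['ab', 'ab'] and b is ['a', 'ba'].

With (?:b), the match ab consumes both characters, so the scan resumes at the third character (a). With (?=b), the b is only checked and not consumed: the match is just a, and the scan resumes at the second character (b), which is why ba can match next.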
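
The difference in consumed characters can be made visible by printing match spans instead of matched text. This is a minimal sketch using re.finditer; the variable names are assumptions for illustration.

```python
import re

text = 'ababd'

# (?:b) is part of the match, so each match covers two characters
consuming = [m.span() for m in re.finditer(r'a(?:b)', text)]
# (?=b) only peeks at the next character, so each match covers one
peeking = [m.span() for m in re.finditer(r'a(?=b)', text)]

print(consuming)  # [(0, 2), (2, 4)]
print(peeking)    # [(0, 1), (2, 3)]
```
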
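The third:
(?!...) is much like the second, but the match succeeds only when the lookahead pattern does NOT match at that position.

    import re
    text = 'acacd'
    a = re.findall(r'(a(?!b)|ba)', text)
    b = re.findall(r'(a(?=b)|ba)', text)
    print(a)  # ['a', 'a']: neither a is followed by b
    print(b)  # []: no a is followed by b, and there is no ba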
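
The two lookaheads can also be compared on whole words. The sketch below uses a made-up string to contrast "foo followed by bar" with "foo not followed by bar"; the names and sample text are assumptions for illustration.

```python
import re

text = 'foobar foobaz food'

# foo only when bar comes next
followed = re.findall(r'foo(?=bar)\w*', text)
# foo only when bar does NOT come next
not_followed = re.findall(r'foo(?!bar)\w*', text)

print(followed)      # ['foobar']
print(not_followed)  # ['foobaz', 'food']
```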


Back to our problem4:

        return re.findall(r'(?:[01]?\d|2[0-3]):[0-5][0-9](?:AM)?', text)
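
To see why the ?: markers matter here, the same pattern can be run with and without them. This is a sketch on a made-up sentence; the variable names and the sample text are assumptions, not the assignment's doctests.

```python
import re

text = "At 05:24AM I had a bagel, then coffee at 7:23."

capturing = r'([01]?\d|2[0-3]):[0-5][0-9](AM)?'
non_capturing = r'(?:[01]?\d|2[0-3]):[0-5][0-9](?:AM)?'

# with capturing groups, findall returns one tuple of groups per match
print(re.findall(capturing, text))      # [('05', 'AM'), ('7', '')]
# with no capturing groups, findall returns the full matched text
print(re.findall(non_capturing, text))  # ['05:24AM', '7:23']
```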

problem5:

        return re.findall(r'(?:\()?(\d{3})(?:\)?)(?: |-)?\d{3}(?: |-)?\d{4}\b', text)

    def most_common_code(text):
        """
        Takes in an input string which contains at least one phone number (and
        may contain more) and returns the most common area code among all phone
        numbers in the input. If there are multiple area codes with the same
        frequency, return the first one that appears in the input text.

        >>> most_common_code('(501) 333 3333')
        '501'
        >>> input_text = '''
        ... (123) 000 1234 and 12454, 098-123-0941, 123 451 0951 and 410-501-3021 has
        ... some phone numbers. '''
        >>> most_common_code(input_text)
        '123'
        """
        "*** YOUR CODE HERE ***"
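        area_codes_list = area_codes(text)
        # count how often each code occurs, keeping the order of appearance
        cnts = [area_codes_list.count(e) for e in area_codes_list]
        # index() returns the first position holding the maximum, so ties go to the earliest code
        first_max = cnts.index(max(cnts))
        return area_codes_list[first_max]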
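
As an alternative to counting with list.count, the tally can be done in one pass with collections.Counter. This is only a sketch of a different technique, not the approach used above; it reuses the problem5 pattern and assumes the text holds at least one phone number, as the docstring states.

```python
import re
from collections import Counter

def area_codes(text):
    # same pattern as the problem5 answer; the single capturing group is the area code
    return re.findall(r'(?:\()?(\d{3})(?:\)?)(?: |-)?\d{3}(?: |-)?\d{4}\b', text)

def most_common_code(text):
    counts = Counter(area_codes(text))  # keys keep first-seen order
    # max() returns the first key with the largest count, so ties go to the earliest code
    return max(counts, key=counts.get)

print(most_common_code('(501) 333 3333'))                 # 501
print(most_common_code('111 111 1111 and 222 222 2222'))  # 111
```
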
posted @ 2022-04-26 13:23  天然气之子